Abstract

We first review existing sequential methods for estimating a binomial proportion. Afterward,
we propose a new family of group sequential sampling schemes for estimating a binomial
proportion with prescribed margin of error and confidence level. In particular, we establish the
uniform controllability of coverage probability and the asymptotic optimality for such a family
of sampling schemes. Our theoretical results establish the possibility that the parameters of this
family of sampling schemes can be determined so that the prescribed level of confidence is
guaranteed with little waste of samples. Analytic bounds for the cumulative distribution functions
and expectations of sample numbers are derived. Moreover, we discuss the inherent connection
of various sampling schemes. Numerical issues are addressed for improving the accuracy and
efficiency of computation. Computational experiments are conducted for comparing sampling
schemes. Illustrative examples are given for applications in clinical trials.

1 Introduction

Estimating a binomial proportion is a problem of ubiquitous significance in many areas of
engineering and sciences. For economical reasons and other concerns, it is important to use as few
samples as possible to guarantee the required reliability of estimation. To achieve this goal, sequential
sampling schemes can be very useful. In a sequential sampling scheme, the total number of
observations is not fixed in advance. The sampling process is continued stage by stage until a pre-specified
stopping rule is satisfied. The stopping rule is evaluated with accumulated observations. In many
applications, for administrative feasibility, the sampling experiment is performed in a group
fashion. Similar to group sequential tests [H Section 8], an estimation method based on taking
samples by groups and evaluating them sequentially is referred to as a group sequential estimation
method. It should be noted that group sequential estimation methods are general enough to include
fixed-sample-size and fully sequential procedures as special cases. In particular, a fixed-sample-size
method can be viewed as a group sequential procedure of only one stage. If the increment between
the sample sizes of consecutive stages is equal to 1, then the group sequential method is actually a
fully sequential method.

It is a common contention that statistical inference, as a unique science to quantify the uncer- 
tainties of inferential statements, should avoid errors in the quantification of uncertainties, while 
minimizing the sampling cost. That is, a statistical inferential method is expected to be exact 
and efficient. The conventional notion of exactness is that no approximation is involved, except 
the roundoff error due to finite word length of computers. Existing sequential methods for esti- 
mating a binomial proportion are dominantly of asymptotic nature (see, e.g., [5l [23l [231 EH E2] 
and the references therein). Undoubtedly, asymptotic techniques provide approximate solutions 
and important insights for the relevant problems. However, any asymptotic method inevitably 
introduces unknown error in the resultant approximate solution due to the necessary use of a finite 
number of samples. In the direction of non-asymptotic sequential estimation, the primary goal
is to ensure that the true coverage probability is above the pre-specified confidence level for any 
value of the associated parameter, while the required sample size is as low as possible. In this 
direction, Mendo and Hernando [30] developed an inverse binomial sampling scheme for estimat- 
ing a binomial proportion with relative precision. Tanaka [33] developed a rigorous method for 
constructing fixed-width sequential confidence intervals for a binomial proportion. Although no 
approximation is involved, Tanaka's method is very conservative due to the bounding techniques 
employed in the derivation of sequential confidence intervals. Franzen [20] studied the construc- 
tion of fixed-width sequential confidence intervals for a binomial proportion. However, no effective 
method for defining stopping rules is proposed in [20]. In his later paper [21], Franzen proposed 
to construct fixed-width confidence intervals based on sequential probability ratio tests (SPRTs) 
invented by Wald [M]. His method can generate fixed-sample-size confidence intervals based on 
SPRTs. Unfortunately, he made a fundamental error in assuming that if the width of the
fixed-sample-size confidence interval decreases below the pre-specified length as the number
of samples is increasing, then the fixed-sample-size confidence interval at the termination of sam- 
pling process is the desired fixed-width sequential confidence interval guaranteeing the prescribed 
confidence level. More recently, Jesse Frey published a paper [22] in The American Statistician 
(TAS) on the classical problem of sequentially estimating a binomial proportion with prescribed 
margin of error and confidence level. Before Frey submitted his original manuscript to TAS in 
July 2009, a general framework of multistage parameter estimation had been established by Chen 
[6, 8, 10, 12, 13], which provides exact methods for estimating parameters of common distributions
with various error criteria. This framework is also proposed in [TJ]. The approach of Frey [22] is
similar to that of Chen [6, 8, 10, 12, 13] for the specific problem of estimating a binomial proportion
with prescribed margin of error and confidence level.

In this paper, our primary interests are in the exact sequential methods for the estimation of 
a binomial proportion with prescribed margin of error and confidence level. We first introduce the 
exact approach established in [6, 8, 10, 12, 13]. In particular, we introduce the inclusion principle
proposed in [13] and its applications to the construction of concrete stopping rules. We investigate 
the connection among various stopping rules. Afterward, we propose a new family of stopping rules 
which are extremely simple and accommodate some existing stopping rules as special cases. We 
provide rigorous justification for the feasibility and asymptotic optimality of such stopping rules. We 
prove that the prescribed confidence level can be guaranteed uniformly for all values of a binomial
proportion by choosing appropriate parametric values for the stopping rule. We show that as the 
margin of error tends to zero, the sample size tends to the attainable minimum as if the binomial 
proportion were exactly known. We derive analytic bounds for distributions and expectations of 
sample numbers. In addition, we address some critical computational issues and propose methods 
to improve the accuracy and efficiency of numerical calculation. We conduct extensive numerical
experiments to study the performance of various stopping rules. We determine parametric values
for the proposed stopping rules to achieve unprecedented efficiency while guaranteeing prescribed
confidence levels. We attempt to make our proposed method as user-friendly as possible so that it
can be immediately applicable even for laypersons.

The remainder of the paper is organized as follows. In Section 2, we introduce the exact approach 
proposed in [6, 8, 10, 12, 13]. In Section 3, we discuss the general principle of constructing stopping
rules. In Section 4, we propose a new family of sampling schemes and investigate their feasibility, 
optimality and analytic bounds of the distribution and expectation of sample numbers. In Section 5, 
we compare various computational methods. In particular, we illustrate why the natural method of 
evaluating coverage probability based on gridding parameter space is neither rigorous nor efficient. 
In Section 6, we present numerical results for various sampling schemes. In Section 7, we illustrate 
the applications of our group sequential method in clinical trials. Section 8 is the conclusion. The 
proofs of theorems are given in appendices. Throughout this paper, we shall use the following 
notations. The empty set is denoted by $\emptyset$. The set of positive integers is denoted by $\mathbb{N}$. The ceiling
function is denoted by $\lceil \cdot \rceil$. The notation $\Pr\{E \mid \theta\}$ denotes the probability of the event $E$ associated
with parameter $\theta$. The expectation of a random variable is denoted by $\mathbb{E}[\cdot]$. The standard normal
distribution is denoted by $\Phi(\cdot)$. For $\alpha \in (0, 1)$, the notation $Z_\alpha$ denotes the critical value such
that $\Phi(Z_\alpha) = 1 - \alpha$. For $n \in \mathbb{N}$, in the case that $X_1, \cdots, X_n$ are i.i.d. samples of $X$, we denote
the sample mean $\frac{\sum_{i=1}^{n} X_i}{n}$ by $\overline{X}_n$, which is also called the relative frequency when $X$ is a Bernoulli
random variable. The other notations will be made clear as we proceed.



2 How Can It Be Exact?

In many areas of scientific investigation, the outcome of an experiment is of a dichotomous nature and
can be modeled as a Bernoulli random variable $X$, defined in probability space $(\Omega, \Pr, \mathscr{F})$, such
that

$$\Pr\{X = 1\} = 1 - \Pr\{X = 0\} = p \in (0, 1),$$

where $p$ is referred to as a binomial proportion. In general, there is no analytical method for
evaluating the binomial proportion $p$. A frequently-used approach is to estimate $p$ based on i.i.d.
samples $X_1, X_2, \cdots$ of $X$. To reduce the sampling cost, it is appropriate to estimate $p$ by a
multistage sampling procedure. More formally, let $\varepsilon \in (0, 1)$ and $1 - \delta$, with $\delta \in (0, 1)$, be the
pre-specified margin of error and confidence level respectively. The objective is to construct a
sequential estimator $\hat{p}$ for $p$ based on a multistage sampling scheme such that

$$\Pr\{|\hat{p} - p| < \varepsilon \mid p\} \geq 1 - \delta \qquad (1)$$
for any $p \in (0, 1)$. Throughout this paper, the probability $\Pr\{|\hat{p} - p| < \varepsilon \mid p\}$ is referred to as
the coverage probability. Accordingly, the probability $\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ is referred to as the
complementary coverage probability. Clearly, a complete construction of a multistage estimation
scheme needs to determine the number of stages, the sample sizes for all stages, the stopping rule,
and the estimator for $p$. Throughout this paper, we let $s$ denote the number of stages and let $n_\ell$
denote the number of samples at the $\ell$-th stage. That is, the sampling process consists of $s$ stages
with sample sizes $n_1 < n_2 < \cdots < n_s$. For $\ell = 1, 2, \cdots, s$, define $K_\ell = \sum_{i=1}^{n_\ell} X_i$ and $\hat{p}_\ell = \frac{K_\ell}{n_\ell}$. The
stopping rule is to be defined in terms of $\hat{p}_\ell$, $\ell = 1, \cdots, s$. Of course, the index of stage at the
termination of the sampling process, denoted by $\boldsymbol{l}$, is a random number. Accordingly, the number
of samples at the termination of the experiment, denoted by $\mathbf{n}$, is a random number which equals
$n_{\boldsymbol{l}}$. Since for each $\ell$, $\hat{p}_\ell$ is a maximum-likelihood and minimum-variance unbiased estimator of $p$,
the sequential estimator for $p$ is taken as

$$\hat{p} = \hat{p}_{\boldsymbol{l}} = \frac{K_{\boldsymbol{l}}}{n_{\boldsymbol{l}}} = \frac{\sum_{i=1}^{\mathbf{n}} X_i}{\mathbf{n}}. \qquad (2)$$

In the above discussion, we have outlined the general characteristics of a multistage sampling 
scheme for estimating a binomial proportion. It remains to determine the number of stages, the 
sample sizes for all stages, and the stopping rule so that the resultant estimator $\hat{p}$ satisfies (1) for
any $p \in (0, 1)$.
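The general shape of such a multistage scheme can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `run_multistage`, the callback `should_stop(p_hat, n)` and the example sample sizes are hypothetical and are not taken from the cited works. The sketch draws Bernoulli samples group by group, evaluates the stopping rule on the accumulated relative frequency, and forces termination at the last stage.

```python
import random

def run_multistage(p, sample_sizes, should_stop, rng):
    """Simulate one run of an s-stage scheme.

    Returns (stage index l, sample number n_l, estimate K_l / n_l).
    `should_stop(p_hat, n)` is a hypothetical stand-in for the stopping rule.
    """
    k = 0        # accumulated number of successes K_l
    drawn = 0    # number of samples drawn so far
    last = len(sample_sizes)
    for stage, n_l in enumerate(sample_sizes, start=1):
        # Draw only the increment needed to reach the sample size of this stage.
        k += sum(rng.random() < p for _ in range(n_l - drawn))
        drawn = n_l
        p_hat = k / n_l
        # Sampling must terminate at or before the last stage.
        if stage == last or should_stop(p_hat, n_l):
            return stage, n_l, p_hat

rng = random.Random(0)
stage, n, p_hat = run_multistage(0.3, [50, 100, 150], lambda z, m: m >= 100, rng)
```

With a one-element list of sample sizes the same function reduces to a fixed-sample-size procedure, and with sizes 1, 2, 3, ... it becomes fully sequential.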

Actually, the problem of sequential estimation of a binomial proportion has been treated by 
Chen [6, 8, 10, 12, 13] in a general framework of multistage parameter estimation. The techniques
of [6, 8, 10, 12, 13] are sufficient to offer exact solutions for a wide range of sequential estimation
problems, including the estimation of a binomial proportion as a special case. The central idea
of the approach in [6, 8, 10, 12, 13] is the control of coverage probability by a single parameter
$\zeta$, referred to as the coverage tuning parameter, and the adaptive rigorous checking of coverage
guarantee by virtue of bounds of coverage probabilities. It is recognized in [6, 8, 10, 12, 13] that,
due to the discontinuity of the coverage probability on parameter space, the conventional method
of evaluating the coverage probability for a finite number of parameter values is neither rigorous
nor computationally efficient for checking the coverage probability guarantee.

As mentioned in the introduction, Frey published an article [22] in TAS on the sequential 
estimation of a binomial proportion with prescribed margin of error and confidence level. For 
clarity of presentation, the comparison of the works of Chen and Frey is given in Section 5.4. In 
the remainder of this section, we shall only introduce the idea and techniques of [6, 8, 10, 12, 13],
which had been developed by Chen before Frey submitted his original manuscript to
TAS in July 2009. We will introduce the approach of [6, 8, 10, 12, 13] with a focus on the special
problem of estimating a binomial proportion with prescribed margin of error and confidence level. 

2.1 Four Components Suffice 

The exact methods of [6, 8, 10, 12, 13] for multistage parameter estimation have four main
components as follows:

(I) Stopping rules parameterized by the coverage tuning parameter $\zeta > 0$ such that the associated
coverage probabilities can be made arbitrarily close to 1 by choosing $\zeta > 0$ to be a sufficiently small
number.

(II) Recursively computable lower and upper bounds for the complementary coverage probability
for a given $\zeta$ and an interval of parameter values.

(III) Adapted Branch and Bound Algorithm. 

(IV) Bisection coverage tuning. 

Without looking at the technical details, one can see that these four components are sufficient 
for constructing a sequential estimator so that the prescribed confidence level is guaranteed. The 
reason is as follows: As lower and upper bounds for the complementary coverage probability are 
available, the global optimization technique, Branch and Bound (B&B) Algorithm [28], can be used
to compute exactly the maximum of complementary coverage probability on the whole parameter
space. Thus, it is possible to check rigorously whether the coverage probability associated with
a given $\zeta$ is no less than the pre-specified confidence level. Since the coverage probability can be
controlled by $\zeta$, it is possible to determine $\zeta$ as large as possible to guarantee the desired confidence
level by a bisection search. This process is referred to as bisection coverage tuning in [6, 8, 10, 12, 13].
Since a critical subroutine needed for bisection coverage tuning is to check whether the coverage
probability is no less than the pre-specified confidence level, it is not necessary to compute exactly
the maximum of the complementary coverage probability. Therefore, Chen revised the standard
B&B algorithm to reduce the computational complexity and called the improved algorithm
the Adapted B&B Algorithm. The idea is to adaptively partition the parameter space into many
subintervals. If for all subintervals, the upper bounds of the complementary coverage probability
are no greater than $\delta$, then declare that the coverage probability is guaranteed. If there exists a
subinterval for which the lower bound of the complementary coverage probability is greater than $\delta$,
then declare that the coverage probability is not guaranteed. Continue partitioning the parameter
space if no decision can be made. The four components are illustrated in the sequel under the 
headings of stopping rules, interval bounding, adapted branch and bound, and bisection coverage 
tuning. 

2.2 Stopping Rules 

The first component for the exact sequential estimation of a binomial proportion is the stopping 
rule for constructing a sequential estimator such that the coverage probability can be controlled 
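by the coverage tuning parameter $\zeta$. For convenience of describing some concrete stopping rules,
define

$$\mathscr{M}(z, \theta) = \begin{cases}
z \ln \frac{\theta}{z} + (1 - z) \ln \frac{1 - \theta}{1 - z} & \text{for } z \in (0, 1) \text{ and } \theta \in (0, 1), \\
\ln(1 - \theta) & \text{for } z = 0 \text{ and } \theta \in (0, 1), \\
\ln \theta & \text{for } z = 1 \text{ and } \theta \in (0, 1), \\
-\infty & \text{for } z \in [0, 1] \text{ and } \theta \notin (0, 1)
\end{cases}$$

and

$$S(k, l, n, p) = \sum_{i=k}^{l} \binom{n}{i} p^i (1 - p)^{n-i},$$

where $k$ and $l$ are integers such that $0 \leq k \leq l \leq n$. Assume that $0 < \zeta\delta < 1$. For the purpose of
controlling the coverage probability $\Pr\{|\hat{p} - p| < \varepsilon \mid p\}$ by the coverage tuning parameter, Chen
has proposed four stopping rules as follows:

Stopping Rule A: Continue sampling until $\mathscr{M}\left(\frac{1}{2} - \left|\frac{1}{2} - \hat{p}_\ell\right|, \; \frac{1}{2} - \left|\frac{1}{2} - \hat{p}_\ell\right| + \varepsilon\right) \leq \frac{\ln(\zeta\delta)}{n_\ell}$ for some
$\ell \in \{1, \cdots, s\}$.

Stopping Rule B: Continue sampling until $\left(\left|\hat{p}_\ell - \frac{1}{2}\right| - \frac{2}{3}\varepsilon\right)^2 \geq \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)}$ for some $\ell \in \{1, \cdots, s\}$.

Stopping Rule C: Continue sampling until $S(K_\ell, n_\ell, n_\ell, \hat{p}_\ell - \varepsilon) \leq \zeta\delta$ and $S(0, K_\ell, n_\ell, \hat{p}_\ell + \varepsilon) \leq \zeta\delta$
for some $\ell \in \{1, \cdots, s\}$.

Stopping Rule D: Continue sampling until $n_\ell \geq \hat{p}_\ell (1 - \hat{p}_\ell) \frac{2}{\varepsilon^2} \ln \frac{1}{\zeta\delta}$ for some $\ell \in \{1, \cdots, s\}$.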

Stopping Rule A was first proposed in [6, Theorem 7] and restated in [8, Theorem 16]. Stopping
Rule B was first proposed in [10, Theorem 1] and represented as the third stopping rule in [9, Section
4.1.1]. Stopping Rule C originated from [12, Theorem 1] and was restated as the first stopping rule
in [9, Section 4.1.1]. Stopping Rule D was described in the remarks following Theorem 7 of [7]. All
these stopping rules can be derived from the general principles proposed in [13, Section 3] and [14,
Section 2.4].
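As an illustration, Stopping Rules B, C and D can be written as small Python predicates. This is a sketch under assumptions rather than a reference implementation: the function names are hypothetical, the rules are coded in the forms $(|\hat{p}_\ell - 1/2| - 2\varepsilon/3)^2 \geq 1/4 + \varepsilon^2 n_\ell / (2 \ln(\zeta\delta))$, $S(K_\ell, n_\ell, n_\ell, \hat{p}_\ell - \varepsilon) \leq \zeta\delta$ together with $S(0, K_\ell, n_\ell, \hat{p}_\ell + \varepsilon) \leq \zeta\delta$, and $n_\ell \geq \hat{p}_\ell(1 - \hat{p}_\ell)(2/\varepsilon^2)\ln(1/(\zeta\delta))$, and Rule C treats a condition as satisfied when $\hat{p}_\ell - \varepsilon \leq 0$ or $\hat{p}_\ell + \varepsilon \geq 1$, which is an assumed convention matching the derivation from Clopper-Pearson intervals in Section 3.5.

```python
from math import comb, log

def binom_sum(k, l, n, p):
    """S(k, l, n, p): sum of Binomial(n, p) probabilities over i = k, ..., l."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, l + 1))

def rule_b(p_hat, n, eps, zeta, delta):
    # ln(zeta * delta) is negative because 0 < zeta * delta < 1 is assumed.
    return (abs(p_hat - 0.5) - 2 * eps / 3) ** 2 >= 0.25 + eps**2 * n / (2 * log(zeta * delta))

def rule_c(k, n, eps, zeta, delta):
    p_hat = k / n
    # Assumed convention: a tail condition holds trivially once the shifted
    # value p_hat -/+ eps leaves the open interval (0, 1).
    lower_ok = p_hat - eps <= 0 or binom_sum(k, n, n, p_hat - eps) <= zeta * delta
    upper_ok = p_hat + eps >= 1 or binom_sum(0, k, n, p_hat + eps) <= zeta * delta
    return lower_ok and upper_ok

def rule_d(p_hat, n, eps, zeta, delta):
    return n >= p_hat * (1 - p_hat) * (2 / eps**2) * log(1 / (zeta * delta))

# Example: eps = 0.1, zeta = 1, delta = 0.05, relative frequency 1/2.
stops_b = rule_b(0.5, 148, 0.1, 1.0, 0.05)
stops_d = rule_d(0.5, 150, 0.1, 1.0, 0.05)
```

Rules B and D depend on the data only through the relative frequency and the sample size, whereas Rule C needs the binomial tail sums and is therefore more expensive to evaluate.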

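Given that a stopping rule can be expressed in terms of $\hat{p}_\ell$ and $n_\ell$ for $\ell = 1, \cdots, s$, it is possible
to find a bivariate function $\mathscr{D}(\cdot, \cdot)$ on $\{(z, n) : z \in [0, 1], n \in \mathbb{N}\}$, taking values from $\{0, 1\}$, such that
the stopping rule can be stated as: Continue sampling until $\mathscr{D}(\hat{p}_\ell, n_\ell) = 1$ for some $\ell \in \{1, \cdots, s\}$.
It can be checked that such representation applies to Stopping Rules A, B, C, and D. For example,
Stopping Rule B can be expressed in this way by virtue of function $\mathscr{D}(\cdot, \cdot)$ such that

$$\mathscr{D}(z, n) = \begin{cases}
1 & \text{if } \left(\left|z - \frac{1}{2}\right| - \frac{2}{3}\varepsilon\right)^2 \geq \frac{1}{4} + \frac{\varepsilon^2 n}{2 \ln(\zeta\delta)}, \\
0 & \text{otherwise.}
\end{cases}$$

The motivation of introducing function $\mathscr{D}(\cdot, \cdot)$ is to parameterize the stopping rule in terms of
design parameters. The function $\mathscr{D}(\cdot, \cdot)$ determines the form of the stopping rule and consequently,
the sample sizes for all stages can be chosen as functions of design parameters. Specifically, let

$$N_{\min} = \min\left\{ n \in \mathbb{N} : \mathscr{D}\left(\tfrac{k}{n}, n\right) = 1 \text{ for some nonnegative integer } k \text{ not exceeding } n \right\}, \qquad (3)$$

$$N_{\max} = \min\left\{ n \in \mathbb{N} : \mathscr{D}\left(\tfrac{k}{n}, n\right) = 1 \text{ for all nonnegative integers } k \text{ not exceeding } n \right\}. \qquad (4)$$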



To avoid unnecessary checking of the stopping criterion and thus reduce administrative cost, there
should be a possibility that the sampling process is terminated at the first stage. Hence, the
minimum sample size $n_1$ should be chosen to ensure that $\{\mathbf{n} = n_1\} \neq \emptyset$. This implies that the
sample size $n_1$ for the first stage can be taken as $N_{\min}$. On the other hand, since the sampling
process must be terminated at or before the $s$-th stage, the maximum sample size $n_s$ should be
chosen to guarantee that $\{\mathbf{n} > n_s\} = \emptyset$. This implies that the sample size $n_s$ for the last stage can
be taken as $N_{\max}$. If the number of stages $s$ is given, then the sample sizes for stages in between
1 and $s$ can be chosen as $s - 2$ integers between $N_{\min}$ and $N_{\max}$. Specially, if the group sizes are
expected to be approximately equal, then the sample sizes can be taken as

$$n_\ell = \left\lceil N_{\min} + \frac{\ell - 1}{s - 1} \left(N_{\max} - N_{\min}\right) \right\rceil, \qquad \ell = 1, \cdots, s. \qquad (5)$$
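The quantities in (3), (4) and (5) can be found by direct search once the function describing the stopping rule is available. The sketch below is an illustration under assumptions: it uses Stopping Rule B in the form $(|z - 1/2| - 2\varepsilon/3)^2 \geq 1/4 + \varepsilon^2 n / (2\ln(\zeta\delta))$, it takes the equally spaced sizes to be $\lceil N_{\min} + \frac{\ell-1}{s-1}(N_{\max} - N_{\min}) \rceil$, and the helper names `make_rule_b`, `n_min`, `n_max` and `stage_sizes` are hypothetical.

```python
from math import log

def make_rule_b(eps, zeta, delta):
    denom = 2.0 * log(zeta * delta)  # negative, since 0 < zeta * delta < 1 is assumed
    def stop(z, n):
        return (abs(z - 0.5) - 2.0 * eps / 3.0) ** 2 >= 0.25 + eps**2 * n / denom
    return stop

def n_min(stop, limit=100000):
    """Smallest n at which stopping is possible for at least one k in 0..n."""
    for n in range(1, limit + 1):
        if any(stop(k / n, n) for k in range(n + 1)):
            return n
    raise ValueError("no admissible n found below the search limit")

def n_max(stop, limit=100000):
    """Smallest n at which stopping happens for every k in 0..n."""
    for n in range(1, limit + 1):
        if all(stop(k / n, n) for k in range(n + 1)):
            return n
    raise ValueError("no admissible n found below the search limit")

def stage_sizes(lo, hi, s):
    """Approximately equal group sizes between lo and hi, using integer ceilings."""
    if s == 1:
        return [hi]
    ceil_div = lambda a, b: -(-a // b)
    return [lo + ceil_div((l - 1) * (hi - lo), s - 1) for l in range(1, s + 1)]

stop_b = make_rule_b(eps=0.1, zeta=1.0, delta=0.05)
first, last = n_min(stop_b), n_max(stop_b)
sizes = stage_sizes(first, last, 5)
```

For these illustrative values the search gives a first-stage size of 38 and a last-stage size of 150. The ceiling is computed with integer arithmetic to avoid rounding surprises from floating-point division.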



Since the stopping rule is associated with the coverage tuning parameter $\zeta$, it follows that the
number of stages $s$ and the sample sizes $n_1, n_2, \cdots, n_s$ can be expressed as functions of $\zeta$. In this
sense, it can be said that the stopping rule is parameterized by the coverage tuning parameter $\zeta$.
The above method of parameterizing stopping rules has been used in [6, 8, 10, 12] and proposed in
[HI Section 2.1, page 9].

2.3 Interval Bounding 

The second component for the exact sequential estimation of a binomial proportion is the method 
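of bounding the complementary coverage probability $\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ for $p$ in an interval $[a, b]$
contained by interval $(0, 1)$. Applying Theorem 8 of to the special case of a Bernoulli distribution
immediately yields

$$\Pr\{\hat{p} \leq a - \varepsilon \mid b\} + \Pr\{\hat{p} \geq b + \varepsilon \mid a\} \leq \Pr\{|\hat{p} - p| \geq \varepsilon \mid p\} \leq \Pr\{\hat{p} \leq b - \varepsilon \mid a\} + \Pr\{\hat{p} \geq a + \varepsilon \mid b\} \qquad (6)$$

for all $p \in [a, b] \subseteq (0, 1)$. The bounds of (6) can be shown as follows: Note that

$$\Pr\{\hat{p} \leq a - \varepsilon \mid p\} + \Pr\{\hat{p} \geq b + \varepsilon \mid p\} \leq \Pr\{|\hat{p} - p| \geq \varepsilon \mid p\} = \Pr\{\hat{p} \leq p - \varepsilon \mid p\} + \Pr\{\hat{p} \geq p + \varepsilon \mid p\} \leq \Pr\{\hat{p} \leq b - \varepsilon \mid p\} + \Pr\{\hat{p} \geq a + \varepsilon \mid p\}$$

for $p \in [a, b] \subseteq (0, 1)$. As a consequence of the monotonicity of $\Pr\{\hat{p} \geq \vartheta \mid p\}$
and $\Pr\{\hat{p} \leq \vartheta \mid p\}$ with respect to $p$, where $\vartheta$ is a real number independent of $p$, the lower and upper
bounds of $\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ for $p \in [a, b] \subseteq (0, 1)$ can be given as $\Pr\{\hat{p} \leq a - \varepsilon \mid b\} + \Pr\{\hat{p} \geq b + \varepsilon \mid a\}$
and $\Pr\{\hat{p} \leq b - \varepsilon \mid a\} + \Pr\{\hat{p} \geq a + \varepsilon \mid b\}$ respectively.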

On page 15, equation (1) of 0, Chen proposed to apply the recursive method of Schultz [31,
Section 2] to compute the lower and upper bounds of $\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ given by (6). It should be
pointed out that such lower and upper bounds of $\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ can also be computed by the
recursive path-counting method of Franzen [20, page 49].
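The interval bounds can be made concrete with a short forward recursion over the stages. The following Python sketch is an assumption-laden illustration and not the recursive method of Schultz or the path-counting method of Franzen as published: it propagates the probability mass of the paths that have not yet stopped, records the mass that stops at each pair of sample size and success count, and then evaluates the two sides of the bound for an interval $[a, b]$. The function names and the example stopping rule (Stopping Rule D with $\varepsilon = 0.1$ and $\zeta\delta = 0.05$) are illustrative choices.

```python
from math import comb, log

def terminal_distribution(p, sizes, stop):
    """Exact law of (n_l, K_l) at termination, by forward recursion over stages."""
    terminal = {}
    alive = {0: 1.0}   # successes so far -> probability of still sampling
    prev_n = 0
    for idx, n in enumerate(sizes):
        m = n - prev_n  # group size of this stage
        step = [comb(m, j) * p**j * (1 - p) ** (m - j) for j in range(m + 1)]
        reached = {}
        for k, w in alive.items():
            for j, q in enumerate(step):
                reached[k + j] = reached.get(k + j, 0.0) + w * q
        alive = {}
        is_last = idx == len(sizes) - 1
        for k, w in reached.items():
            if is_last or stop(k / n, n):
                terminal[(n, k)] = w   # sampling ends here
            else:
                alive[k] = w           # sampling continues to the next stage
        prev_n = n
    return terminal

def tail_le(p, c, sizes, stop):
    return sum(w for (n, k), w in terminal_distribution(p, sizes, stop).items() if k / n <= c)

def tail_ge(p, c, sizes, stop):
    return sum(w for (n, k), w in terminal_distribution(p, sizes, stop).items() if k / n >= c)

def complementary_coverage(p, eps, sizes, stop):
    return tail_le(p, p - eps, sizes, stop) + tail_ge(p, p + eps, sizes, stop)

def interval_bounds(a, b, eps, sizes, stop):
    lower = tail_le(b, a - eps, sizes, stop) + tail_ge(a, b + eps, sizes, stop)
    upper = tail_le(a, b - eps, sizes, stop) + tail_ge(b, a + eps, sizes, stop)
    return lower, upper

eps, zeta_delta = 0.1, 0.05
stop = lambda z, n: n >= z * (1 - z) * (2 / eps**2) * log(1 / zeta_delta)
sizes = [38, 66, 94, 122, 150]
lo, hi = interval_bounds(0.30, 0.32, eps, sizes, stop)
```

The comparisons `k / n <= c` are made in floating point, so a careful implementation would handle ties at the boundary with exact integer or rational arithmetic.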

2.4 Adapted Branch and Bound 

The third component for the exact sequential estimation of a binomial proportion is the Adapted 
B&B Algorithm, which was proposed in [8, Section 2.8], for quick determination of whether the
coverage probability is no less than $1 - \delta$ for any value of the associated parameter. Such a
task of checking the coverage probability is also referred to as checking the coverage probability
guarantee. Given that lower and upper bounds of the complementary coverage probability on
an interval of parameter values can be obtained by the interval bounding techniques, this task
can be accomplished by applying the B&B Algorithm [25] to compute exactly the maximum of
the complementary coverage probability on the parameter space. However, in our applications, it
suffices to determine whether the maximum of the complementary coverage probability
$\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ with respect to $p \in (0, 1)$ is greater than the confidence parameter $\delta$. For fast checking
whether the maximal complementary coverage probability exceeds $\delta$, Chen proposed to reduce the
computational complexity by revising the standard B&B Algorithm as the Adapted B&B Algorithm
in [8, Section 2.8]. To describe this algorithm, let $\mathscr{I}_{\mathrm{init}}$ denote the parameter space $(0, 1)$. For an
interval $\mathscr{I} \subseteq \mathscr{I}_{\mathrm{init}}$, let $\max \Psi(\mathscr{I})$ denote the maximum of the complementary coverage probability
$\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ with respect to $p \in \mathscr{I}$. Let $\Psi_{\mathrm{lb}}(\mathscr{I})$ and $\Psi_{\mathrm{ub}}(\mathscr{I})$ be respectively the lower and
upper bounds of $\Psi(\mathscr{I})$, which can be obtained by the interval bounding techniques introduced in
Section 2.3. Let $\eta > 0$ be a pre-specified tolerance, which is much smaller than $\delta$. The Adapted
B&B Algorithm of [8] is presented with a slight modification as follows.
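
∇ Let $k \leftarrow 0$, $l_0 \leftarrow \Psi_{\mathrm{lb}}(\mathscr{I}_{\mathrm{init}})$ and $u_0 \leftarrow \Psi_{\mathrm{ub}}(\mathscr{I}_{\mathrm{init}})$.

∇ Let $\mathscr{S}_0 \leftarrow \{\mathscr{I}_{\mathrm{init}}\}$ if $u_0 > \delta$. Otherwise, let $\mathscr{S}_0$ be empty.

∇ While $\mathscr{S}_k$ is nonempty, $l_k < \delta$ and $u_k$ is greater than $\max\{l_k + \eta, \delta\}$, do the following:

⋄ Split each interval in $\mathscr{S}_k$ as two new intervals of equal length.
Let $S_k$ denote the set of all new intervals obtained from this splitting procedure.
⋄ Eliminate any interval $\mathscr{I}$ from $S_k$ such that $\Psi_{\mathrm{ub}}(\mathscr{I}) \leq \delta$.
⋄ Let $\mathscr{S}_{k+1}$ be the set $S_k$ processed by the above elimination procedure.
⋄ Let $l_{k+1} \leftarrow \max_{\mathscr{I} \in \mathscr{S}_{k+1}} \Psi_{\mathrm{lb}}(\mathscr{I})$ and $u_{k+1} \leftarrow \max_{\mathscr{I} \in \mathscr{S}_{k+1}} \Psi_{\mathrm{ub}}(\mathscr{I})$. Let $k \leftarrow k + 1$.

∇ If $\mathscr{S}_k$ is empty and $l_k < \delta$, then declare $\max \Psi(\mathscr{I}_{\mathrm{init}}) \leq \delta$.
Otherwise, declare $\max \Psi(\mathscr{I}_{\mathrm{init}}) > \delta$.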
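The steps above can be sketched in Python as follows. This is an illustrative reading of the listed steps, not the implementation of [8]: the bound functions are passed in as callables `psi_lb(a, b)` and `psi_ub(a, b)`, intervals are represented by their endpoints, the previous bounds are kept when the active set becomes empty, and `max_rounds` is an added safeguard. The toy bounds used in the example (for the function $4c\,p(1-p)$ with a Lipschitz-type upper bound) are hypothetical and serve only to exercise the control flow.

```python
def adapted_bnb(psi_lb, psi_ub, delta, eta, init=(0.0, 1.0), max_rounds=50):
    """Return True if the maximum is declared <= delta, False otherwise."""
    l = psi_lb(*init)
    u = psi_ub(*init)
    active = [init] if u > delta else []
    rounds = 0
    while active and l < delta and u > max(l + eta, delta):
        # Split every active interval into two halves of equal length.
        halves = []
        for a, b in active:
            mid = 0.5 * (a + b)
            halves += [(a, mid), (mid, b)]
        # Eliminate intervals whose upper bound is already no greater than delta.
        active = [iv for iv in halves if psi_ub(*iv) > delta]
        if active:
            l = max(psi_lb(*iv) for iv in active)
            u = max(psi_ub(*iv) for iv in active)
        rounds += 1
        if rounds >= max_rounds:
            raise RuntimeError("no decision reached within max_rounds")
    return (not active) and l < delta

def make_toy_bounds(peak):
    """Hypothetical bounds for f(p) = 4 * peak * p * (1 - p), whose maximum is `peak`."""
    f = lambda p: 4 * peak * p * (1 - p)
    lip = 4 * peak  # |f'(p)| <= 4 * peak on [0, 1]
    lb = lambda a, b: max(f(a), f(b))
    ub = lambda a, b: max(f(a), f(b)) + lip * (b - a) / 2
    return lb, ub

guaranteed = adapted_bnb(*make_toy_bounds(0.03), delta=0.05, eta=1e-4)
violated = adapted_bnb(*make_toy_bounds(0.08), delta=0.05, eta=1e-4)
```

Because only a yes or no decision is required, the loop stops as soon as every remaining interval has been eliminated or some lower bound reaches $\delta$, without locating the maximizer itself.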

It should be noted that for a sampling scheme of symmetrical stopping boundary, the initial
interval $\mathscr{I}_{\mathrm{init}}$ may be taken as $(0, \frac{1}{2})$ for the sake of efficiency. In Section 5.1, we will illustrate why
the Adapted B&B Algorithm is superior to the direct evaluation based on gridding parameter
space. As will be seen in Section 5.2, the objective of the Adapted B&B Algorithm can also be
accomplished by the Adaptive Maximum Checking Algorithm due to Chen [9, Section 3.3] and
rediscovered by Frey in the second revision of his manuscript submitted to TAS in April 2010
[22, Appendix]. An explanation is given in Section 5.3 for the advantage of working with the
complementary coverage probability. 

2.5 Bisection Coverage Tuning 

The fourth component for the exact sequential estimation of a binomial proportion is Bisection 
Coverage Tuning. Based on the adaptive rigorous checking of coverage probability, Chen proposed 
in [6, Section 2.7] and [8, Section 2.6] to apply a bisection search method to determine maximal $\zeta$
such that the coverage probability is no less than $1 - \delta$ for any value of the associated parameter.
Moreover, Chen has developed asymptotic results in [8, page 21, Theorem 18] for determining the
initial interval of $\zeta$ needed for the bisection search. Specifically, if the complementary coverage
probability $\Pr\{|\hat{p} - p| \geq \varepsilon \mid p\}$ associated with $\zeta = \zeta_0$ tends to $\delta$ as $\varepsilon \to 0$, then the initial interval
of $\zeta$ can be taken as $[\zeta_0 2^i, \zeta_0 2^{i+1}]$, where $i$ is the largest integer such that the complementary
coverage probability associated with $\zeta = \zeta_0 2^i$ is no greater than $\delta$ for all $p \in (0, 1)$. By virtue of a
bisection search, it is possible to obtain $\zeta^* \in [\zeta_0 2^i, \zeta_0 2^{i+1}]$ such that the complementary coverage
probability associated with $\zeta = \zeta^*$ is guaranteed to be no greater than $\delta$ for all $p \in (0, 1)$.
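A minimal Python sketch of this tuning loop is given below. It rests on assumptions that should be read as such: `is_guaranteed(zeta)` is a hypothetical callable standing in for a rigorous coverage check, the bracketing interval is taken to be $[\zeta_0 2^i, \zeta_0 2^{i+1}]$, the check is assumed to pass for small $\zeta$ and fail for large $\zeta$, and the tolerance and step limits are illustrative safeguards.

```python
def bisection_coverage_tuning(is_guaranteed, zeta0, rel_tol=1e-3, max_steps=200):
    """Return a value of zeta, close to the largest one, that passes the coverage check."""
    i, steps = 0, 0
    if is_guaranteed(zeta0):
        # Double zeta while the coverage check still passes.
        while is_guaranteed(zeta0 * 2.0 ** (i + 1)):
            i += 1
            steps += 1
            if steps > max_steps:
                raise RuntimeError("coverage check never fails; cannot bracket zeta")
    else:
        # Halve zeta until the coverage check passes.
        i = -1
        while not is_guaranteed(zeta0 * 2.0 ** i):
            i -= 1
            steps += 1
            if steps > max_steps:
                raise RuntimeError("coverage check never passes; cannot bracket zeta")
    lo, hi = zeta0 * 2.0 ** i, zeta0 * 2.0 ** (i + 1)
    # Invariant: the check passes at lo and fails at hi.
    while hi - lo > rel_tol * lo:
        mid = 0.5 * (lo + hi)
        if is_guaranteed(mid):
            lo = mid
        else:
            hi = mid
    return lo

# Toy check: pretend coverage is guaranteed exactly when zeta <= 0.37.
toy_check = lambda zeta: zeta <= 0.37
zeta_star = bisection_coverage_tuning(toy_check, zeta0=1.0)
```

The returned value is always one at which the check has passed, so the guarantee is preserved even though the bisection stops at a finite tolerance.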

3 Principle of Constructing Stopping Rules 

In this section, we shall illustrate the inherent connection between various stopping rules. It will 
be demonstrated that a lot of stopping rules can be derived by virtue of the inclusion principle 
proposed by Chen [13, Section 3].

3.1 Inclusion Principle

The problem of estimating a binomial proportion can be considered as a special case of parameter 
estimation for a random variable $X$ parameterized by $\theta \in \Theta$, where the objective is to construct
a sequential estimator $\hat{\theta}$ for $\theta$ such that $\Pr\{|\hat{\theta} - \theta| < \varepsilon \mid \theta\} \geq 1 - \delta$ for any $\theta \in \Theta$. Assume that
the sampling process consists of $s$ stages with sample sizes $n_1 < n_2 < \cdots < n_s$. For $\ell = 1, \cdots, s$,
define an estimator $\hat{\theta}_\ell$ for $\theta$ in terms of samples $X_1, \cdots, X_{n_\ell}$ of $X$. Let $[L_\ell, U_\ell]$, $\ell = 1, 2, \cdots, s$ be a
sequence of confidence intervals such that for any $\ell$, $[L_\ell, U_\ell]$ is defined in terms of $X_1, \cdots, X_{n_\ell}$ and
that the coverage probability $\Pr\{L_\ell < \theta < U_\ell \mid \theta\}$ can be made arbitrarily close to 1 by choosing
$\zeta > 0$ to be a sufficiently small number. In Theorem 2 of [13], Chen proposed the following general
stopping rule:

Continue sampling until $U_\ell - \varepsilon \leq \hat{\theta}_\ell \leq L_\ell + \varepsilon$ for some $\ell \in \{1, \cdots, s\}$. (7)

At the termination of the sampling process, a sequential estimator for $\theta$ is taken as $\hat{\theta} = \hat{\theta}_{\boldsymbol{l}}$, where $\boldsymbol{l}$
is the index of stage at the termination of sampling process.

Clearly, the general stopping rule (7) can be restated as follows:

Continue sampling until the confidence interval $[L_\ell, U_\ell]$ is included by interval $[\hat{\theta}_\ell - \varepsilon, \hat{\theta}_\ell + \varepsilon]$
for some $\ell \in \{1, \cdots, s\}$.

The sequence of confidence intervals is parameterized by $\zeta$ for the purpose of controlling the
coverage probability $\Pr\{|\hat{\theta} - \theta| < \varepsilon \mid \theta\}$. Due to the inclusion relationship $[L_\ell, U_\ell] \subseteq [\hat{\theta}_\ell - \varepsilon, \hat{\theta}_\ell + \varepsilon]$,
such a general methodology of using a sequence of confidence intervals to construct a stopping rule
for controlling the coverage probability is referred to as the inclusion principle. It is asserted by
Theorem 2 of [13] that

$$\Pr\{|\hat{\theta} - \theta| < \varepsilon \mid \theta\} \geq 1 - s\zeta\delta \qquad \forall \theta \in \Theta \qquad (8)$$

provided that $\Pr\{L_\ell < \theta < U_\ell \mid \theta\} \geq 1 - \zeta\delta$ for $\ell = 1, \cdots, s$ and $\theta \in \Theta$. This demonstrates
that if the number of stages $s$ is bounded with respect to $\zeta$, then the coverage probability
$\Pr\{|\hat{\theta} - \theta| < \varepsilon \mid \theta\}$ associated with the stopping rule derived from the inclusion principle can
be controlled by $\zeta$. Actually, before explicitly proposing the inclusion principle in [13], Chen had
extensively applied the inclusion principle in [6, 8, 10, 12] to construct stopping rules for estimating
parameters of various distributions such as binomial, Poisson, geometric, hypergeometric, normal
distributions, etc. A more general version of the inclusion principle is proposed in [14, Section
2.4]. For simplicity of the stopping rule, Chen had made effort to eliminate the computation of
confidence limits.

In the context of estimating a binomial proportion p, the inclusion principle immediately leads 
to the following general stopping rule: 

Continue sampling until $\hat{p}_\ell - \varepsilon \leq L_\ell \leq U_\ell \leq \hat{p}_\ell + \varepsilon$ for some $\ell \in \{1, \cdots, s\}$. (9)

Consequently, the sequential estimator for $p$ is taken as $\hat{p}$ according to (2). It should be pointed
out that the stopping rule (9) had been rediscovered by Frey in Section 2, the 1st paragraph of
[22]. The four stopping rules considered in his paper follow immediately from applying various
confidence intervals to the general stopping rule (9).

In the sequel, we will illustrate how to apply (9) to the derivation of Stopping Rules A, B, C,
D introduced in Section 2.2 and other specific stopping rules.
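
3.2 Stopping Rule from Wald Intervals

By virtue of Wald's method of interval estimation for a binomial proportion $p$, a sequence of
confidence intervals $[L_\ell, U_\ell]$, $\ell = 1, \cdots, s$ for $p$ can be constructed such that

$$L_\ell = \hat{p}_\ell - Z_{\zeta\delta} \sqrt{\frac{\hat{p}_\ell (1 - \hat{p}_\ell)}{n_\ell}}, \qquad U_\ell = \hat{p}_\ell + Z_{\zeta\delta} \sqrt{\frac{\hat{p}_\ell (1 - \hat{p}_\ell)}{n_\ell}}$$

and that $\Pr\{L_\ell < p < U_\ell \mid p\} \approx 1 - 2\zeta\delta$ for $\ell = 1, \cdots, s$ and $p \in (0, 1)$. Note that, for $\ell = 1, \cdots, s$,
the event $\{\hat{p}_\ell - \varepsilon \leq L_\ell \leq U_\ell \leq \hat{p}_\ell + \varepsilon\}$ is the same as the event $\left\{ \left(\hat{p}_\ell - \frac{1}{2}\right)^2 \geq \frac{1}{4} - n_\ell \left(\frac{\varepsilon}{Z_{\zeta\delta}}\right)^2 \right\}$.
So, applying this sequence of confidence intervals to (9) results in the stopping rule "continue
sampling until $\left(\hat{p}_\ell - \frac{1}{2}\right)^2 \geq \frac{1}{4} - n_\ell \left(\frac{\varepsilon}{Z_{\zeta\delta}}\right)^2$ for some $\ell \in \{1, \cdots, s\}$". Since for any $\zeta \in (0, \frac{1}{\delta})$, there
exists a unique number $\zeta' \in (0, \frac{1}{\delta})$ such that $Z_{\zeta\delta} = \sqrt{2 \ln \frac{1}{\zeta'\delta}}$, this stopping rule is equivalent to
"Continue sampling until $\left(\hat{p}_\ell - \frac{1}{2}\right)^2 \geq \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)}$ for some $\ell \in \{1, \cdots, s\}$." This stopping rule is
actually the same as Stopping Rule D, since $\left\{ \left(\hat{p}_\ell - \frac{1}{2}\right)^2 \geq \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)} \right\} = \left\{ n_\ell \geq \hat{p}_\ell (1 - \hat{p}_\ell) \frac{2}{\varepsilon^2} \ln \frac{1}{\zeta\delta} \right\}$.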
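The equivalence can be checked numerically. The Python sketch below is only a sanity check under assumptions: it takes Wald limits of the form $\hat{p}_\ell \pm Z_{\zeta\delta}\sqrt{\hat{p}_\ell(1-\hat{p}_\ell)/n_\ell}$, takes Stopping Rule D as $n_\ell \geq \hat{p}_\ell(1-\hat{p}_\ell)(2/\varepsilon^2)\ln(1/(\zeta'\delta))$ with the re-parameterized value $\zeta'$ obtained from $Z_{\zeta\delta} = \sqrt{2\ln(1/(\zeta'\delta))}$, and uses illustrative values of $\varepsilon$, $\zeta$ and $\delta$. The function names are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def z_crit(alpha):
    """Critical value Z_alpha with Phi(Z_alpha) = 1 - alpha."""
    return NormalDist().inv_cdf(1.0 - alpha)

def wald_half_width(p_hat, n, z):
    return z * sqrt(p_hat * (1.0 - p_hat) / n)

def rule_d(p_hat, n, eps, zeta, delta):
    return n >= p_hat * (1.0 - p_hat) * (2.0 / eps**2) * log(1.0 / (zeta * delta))

eps, zeta, delta = 0.1, 0.5, 0.05
z = z_crit(zeta * delta)
# Re-parameterization: choose zeta_prime so that z = sqrt(2 ln(1 / (zeta_prime * delta))).
zeta_prime = exp(-z * z / 2.0) / delta

mismatches = 0
for n in range(1, 400):
    for k in range(n + 1):
        p_hat = k / n
        half = wald_half_width(p_hat, n, z)
        if abs(half - eps) < 1e-9:
            continue  # skip numerical ties at the boundary
        # [L, U] lies inside [p_hat - eps, p_hat + eps] exactly when half <= eps.
        inclusion = half <= eps
        if inclusion != rule_d(p_hat, n, eps, zeta_prime, delta):
            mismatches += 1
```

In this check the two conditions agree at every grid point away from numerical ties, which is what the algebraic identity predicts.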

3.3 Stopping Rule from Revised Wald Intervals 

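Define $\tilde{p}_\ell = \frac{n_\ell \hat{p}_\ell + a}{n_\ell + 2a}$ for $\ell = 1, \cdots, s$, where $a$ is a positive number. Inspired by Wald's method of
interval estimation for $p$, a sequence of confidence intervals $[L_\ell, U_\ell]$, $\ell = 1, \cdots, s$ can be constructed
such that

$$L_\ell = \tilde{p}_\ell - Z_{\zeta\delta} \sqrt{\frac{\tilde{p}_\ell (1 - \tilde{p}_\ell)}{n_\ell}}, \qquad U_\ell = \tilde{p}_\ell + Z_{\zeta\delta} \sqrt{\frac{\tilde{p}_\ell (1 - \tilde{p}_\ell)}{n_\ell}}$$

and that $\Pr\{L_\ell < p < U_\ell \mid p\} \approx 1 - 2\zeta\delta$ for $\ell = 1, \cdots, s$ and $p \in (0, 1)$. This sequence of confidence
intervals was applied by Frey [22] to the general stopping rule (9). As a matter of fact, such idea
of revising Wald interval

$$\left[ \overline{X}_n - Z_{\zeta\delta} \sqrt{\frac{\overline{X}_n (1 - \overline{X}_n)}{n}}, \; \overline{X}_n + Z_{\zeta\delta} \sqrt{\frac{\overline{X}_n (1 - \overline{X}_n)}{n}} \right]$$

by replacing the relative frequency $\overline{X}_n = \frac{\sum_{i=1}^{n} X_i}{n}$ involved in the confidence limits with
$\tilde{p}_a = \frac{n \overline{X}_n + a}{n + 2a}$ had been proposed by H. Chen [3, Section 4].

As can be seen from Section 2, page 243, of Frey [22], applying (9) with the sequence of revised
Wald intervals yields the stopping rule "Continue sampling until $\left(\tilde{p}_\ell - \frac{1}{2}\right)^2 \geq \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)}$ for some
$\ell \in \{1, \cdots, s\}$". Clearly, replacing $\hat{p}_\ell$ in Stopping Rule D with $\tilde{p}_\ell = \frac{a + n_\ell \hat{p}_\ell}{n_\ell + 2a}$ also leads to this
stopping rule.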

3.4 Stopping Rule from Wilson's Confidence Intervals 



Making use of the interval estimation method of Wilson [35], one can obtain a sequence of confidence 
intervals [L^, U(\, i = 1, - ■ ■ , s for p such that 



L( = max < 0, 



r> I ""C'S -7- , . / Pt(l-P<) I (Zjs^'^^ 



Ue = mm {1, , 

l + ±ci 
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and that PrjL^ <p<Ui\p}^l — 2C,6 for ^ = 1, • • • , s and p G (0, 1). It should be pointed out 
that the sequence of Wilson's confidence intervals has been applied by Prey [221 Section 2, page 
243] to the general stopping rule ([9]) for estimating a binomial proportion. 

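Since a stopping rule that directly involves the sequence of Wilson's confidence intervals is cumbersome, it is desirable to eliminate the computation of Wilson's confidence intervals in the stopping rule. For this purpose, we need to use the following result.

Theorem 1 Assume that $0 < \zeta\delta < 1$ and $0 < \varepsilon < \frac{1}{2}$. Then, Wilson's confidence intervals satisfy

$$\left\{ \hat{p}_\ell - \varepsilon \le L_\ell \le U_\ell \le \hat{p}_\ell + \varepsilon \right\} = \left\{ \left( \left| \hat{p}_\ell - \frac{1}{2} \right| - \varepsilon \right)^2 \ge \frac{1}{4} - n_\ell \left( \frac{\varepsilon}{Z_{\zeta\delta}} \right)^2 \right\} \quad \text{for } \ell = 1, \cdots, s.$$

See Appendix A for a proof. As a consequence of Theorem 1 and the fact that for any $\zeta \in (0, \frac{1}{2\delta})$, there exists a unique number $\zeta' \in (0, \frac{1}{\delta})$ such that $Z_{\zeta\delta} = \sqrt{2 \ln \frac{1}{\zeta'\delta}}$, applying the sequence of Wilson's confidence intervals to (9) leads to the following stopping rule: Continue sampling until

$$\left( \left| \hat{p}_\ell - \frac{1}{2} \right| - \varepsilon \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)} \qquad (10)$$

for some $\ell \in \{1, \cdots, s\}$.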
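The equivalence in Theorem 1 can be checked numerically. The sketch below compares the inclusion event computed directly from Wilson's limits with the parabolic condition, for every possible count at a range of sample sizes. The function names and parameter values are illustrative assumptions rather than anything from the paper, and cases lying within a small tolerance of the boundary are skipped to avoid floating-point ties.

```python
import math
from statistics import NormalDist


def wilson_interval(k, n, z):
    """Wilson limits for k successes in n trials, clipped to [0, 1]."""
    p_hat = k / n
    center = p_hat + z * z / (2 * n)
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + (z / (2 * n)) ** 2)
    denom = 1 + z * z / n
    return max(0.0, (center - half) / denom), min(1.0, (center + half) / denom)


def inclusion_event(k, n, z, eps):
    """True when [L, U] is contained in [p_hat - eps, p_hat + eps]."""
    p_hat = k / n
    lower, upper = wilson_interval(k, n, z)
    return p_hat - eps <= lower and upper <= p_hat + eps


def parabola_margin(k, n, z, eps):
    """Left side minus right side of the condition in Theorem 1."""
    return (abs(k / n - 0.5) - eps) ** 2 - (0.25 - n * (eps / z) ** 2)


def count_mismatches(eps, zeta_delta, sizes, tol=1e-9):
    """Count (k, n) pairs where the two formulations of the event disagree."""
    z = NormalDist().inv_cdf(1 - zeta_delta)  # upper quantile Z_{zeta*delta}
    mismatches = 0
    for n in sizes:
        for k in range(n + 1):
            margin = parabola_margin(k, n, z, eps)
            if abs(margin) < tol:
                continue  # too close to the boundary to compare reliably
            if inclusion_event(k, n, z, eps) != (margin >= 0):
                mismatches += 1
    return mismatches
```

A call such as `count_mismatches(0.1, 0.05, range(1, 151))` scans sample sizes on both sides of the stopping boundary.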



3.5 Stopping Rule from Clopper-Pearson Confidence Intervals 

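Applying the interval estimation method of Clopper-Pearson [17], a sequence of confidence intervals $[L_\ell, U_\ell]$, $\ell = 1, \cdots, s$ for $p$ can be obtained such that $\Pr\{L_\ell < p < U_\ell \mid p\} \ge 1 - 2\zeta\delta$ for $\ell = 1, \cdots, s$ and $p \in (0, 1)$, where the upper confidence limit $U_\ell$ satisfies the equation $S(0, K_\ell, n_\ell, U_\ell) = \zeta\delta$ if $K_\ell < n_\ell$, and the lower confidence limit $L_\ell$ satisfies the equation $S(K_\ell, n_\ell, n_\ell, L_\ell) = \zeta\delta$ if $K_\ell > 0$. The well known equation (10.8) in [HI page 173] implies that $S(0, k, n, p)$, with $0 \le k < n$, is decreasing with respect to $p \in (0, 1)$ and that $S(k, n, n, p)$, with $0 < k \le n$, is increasing with respect to $p \in (0, 1)$. It follows that

$$\begin{aligned}
\{\hat{p}_\ell - \varepsilon \le L_\ell\} &= \{0 < \hat{p}_\ell - \varepsilon \le L_\ell\} \cup \{\hat{p}_\ell \le \varepsilon\} \\
&= \{\hat{p}_\ell > \varepsilon, \ S(K_\ell, n_\ell, n_\ell, \hat{p}_\ell - \varepsilon) \le \zeta\delta\} \cup \{\hat{p}_\ell \le \varepsilon\} \\
&= \{\hat{p}_\ell > \varepsilon, \ S(K_\ell, n_\ell, n_\ell, \hat{p}_\ell - \varepsilon) \le \zeta\delta\} \cup \{\hat{p}_\ell \le \varepsilon, \ S(K_\ell, n_\ell, n_\ell, \hat{p}_\ell - \varepsilon) \le \zeta\delta\} \\
&= \{S(K_\ell, n_\ell, n_\ell, \hat{p}_\ell - \varepsilon) \le \zeta\delta\}
\end{aligned}$$

and

$$\begin{aligned}
\{\hat{p}_\ell + \varepsilon \ge U_\ell\} &= \{1 > \hat{p}_\ell + \varepsilon \ge U_\ell\} \cup \{\hat{p}_\ell \ge 1 - \varepsilon\} \\
&= \{\hat{p}_\ell < 1 - \varepsilon, \ S(0, K_\ell, n_\ell, \hat{p}_\ell + \varepsilon) \le \zeta\delta\} \cup \{\hat{p}_\ell \ge 1 - \varepsilon\} \\
&= \{\hat{p}_\ell < 1 - \varepsilon, \ S(0, K_\ell, n_\ell, \hat{p}_\ell + \varepsilon) \le \zeta\delta\} \cup \{\hat{p}_\ell \ge 1 - \varepsilon, \ S(0, K_\ell, n_\ell, \hat{p}_\ell + \varepsilon) \le \zeta\delta\} \\
&= \{S(0, K_\ell, n_\ell, \hat{p}_\ell + \varepsilon) \le \zeta\delta\}
\end{aligned}$$

for $\ell = 1, \cdots, s$. Consequently,

$$\{\hat{p}_\ell - \varepsilon \le L_\ell \le U_\ell \le \hat{p}_\ell + \varepsilon\} = \{S(K_\ell, n_\ell, n_\ell, \hat{p}_\ell - \varepsilon) \le \zeta\delta, \ S(0, K_\ell, n_\ell, \hat{p}_\ell + \varepsilon) \le \zeta\delta\}$$

for $\ell = 1, \cdots, s$. This demonstrates that applying the sequence of Clopper-Pearson confidence intervals to the general stopping rule (9) gives Stopping Rule C.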
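The reduction above means Stopping Rule C can be evaluated from two binomial tail sums, without solving for the Clopper-Pearson limits. The following is a minimal sketch under the assumption that $S(k, l, n, p)$ denotes the probability that a Binomial$(n, p)$ count lies in $\{k, \cdots, l\}$ (its definition is given earlier in the paper, outside this section). The boundary cases $\hat{p}_\ell \le \varepsilon$ and $\hat{p}_\ell \ge 1 - \varepsilon$ are treated as automatically satisfied, mirroring the corresponding branches of the derivation. Function names are hypothetical.

```python
import math


def binom_sum(k, l, n, p):
    """Assumed S(k, l, n, p): Pr{k <= X <= l} for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, l + 1))


def stopping_rule_c(k, n, eps, zeta_delta):
    """Decide whether to stop after observing k successes in n trials."""
    p_hat = k / n
    # Lower-limit condition; trivially true when p_hat <= eps.
    lower_ok = p_hat <= eps or binom_sum(k, n, n, p_hat - eps) <= zeta_delta
    # Upper-limit condition; trivially true when p_hat >= 1 - eps.
    upper_ok = p_hat >= 1 - eps or binom_sum(0, k, n, p_hat + eps) <= zeta_delta
    return lower_ok and upper_ok
```

For instance, with no successes observed, $\varepsilon = 0.1$ and $\zeta\delta = 0.05$, the rule reduces to checking whether $0.9^n \le 0.05$.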

It should be pointed out that Stopping Rule C was rediscovered by J. Frey as the third stopping rule in Section 2, page 243 of his paper [22].

3.6 Stopping Rule from Fishman's Confidence Intervals

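By the interval estimation method of Fishman [18], a sequence of confidence intervals $[L_\ell, U_\ell]$, $\ell = 1, \cdots, s$ for $p$ can be obtained such that

$$L_\ell = \begin{cases} 0 & \text{if } \hat{p}_\ell = 0, \\ \left\{ \theta_\ell \in (0, \hat{p}_\ell) : \mathscr{M}(\hat{p}_\ell, \theta_\ell) = \frac{\ln(\zeta\delta)}{n_\ell} \right\} & \text{if } \hat{p}_\ell > 0, \end{cases} \qquad U_\ell = \begin{cases} 1 & \text{if } \hat{p}_\ell = 1, \\ \left\{ \theta_\ell \in (\hat{p}_\ell, 1) : \mathscr{M}(\hat{p}_\ell, \theta_\ell) = \frac{\ln(\zeta\delta)}{n_\ell} \right\} & \text{if } \hat{p}_\ell < 1. \end{cases}$$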

Under the assumption that $0 < \zeta\delta < 1$ and $0 < \varepsilon < \frac{1}{2}$, by similar techniques as the proof of Theorem 7 of [7], it can be shown that $\{\hat{p}_\ell - \varepsilon \le L_\ell \le U_\ell \le \hat{p}_\ell + \varepsilon\} = \left\{ \mathscr{M}\left( \frac{1}{2} - \left| \frac{1}{2} - \hat{p}_\ell \right|, \ \frac{1}{2} - \left| \frac{1}{2} - \hat{p}_\ell \right| + \varepsilon \right) \le \frac{\ln(\zeta\delta)}{n_\ell} \right\}$ for $\ell = 1, \cdots, s$. Therefore, applying the sequence of confidence intervals of Fishman to the general stopping rule (9) gives Stopping Rule A.

It should be noted that Fishman's confidence intervals are actually derived from the Chernoff bounds on the tail probabilities of the sample mean of Bernoulli random variables. Hence, Stopping Rule A is also referred to as the stopping rule from Chernoff bounds in this paper.
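Stopping Rule A only needs one evaluation of the function $\mathscr{M}$ per stage. The sketch below assumes the usual Chernoff exponent $\mathscr{M}(z, \theta) = z \ln\frac{\theta}{z} + (1 - z) \ln\frac{1 - \theta}{1 - z}$ with the convention $0 \ln 0 = 0$; the paper defines $\mathscr{M}$ in an earlier section, so this form and the function names are assumptions made for illustration.

```python
import math


def m_func(z, theta):
    """Assumed M(z, theta) = z ln(theta/z) + (1-z) ln((1-theta)/(1-z)), with 0 ln 0 = 0."""
    total = 0.0
    if z > 0:
        total += z * math.log(theta / z)
    if z < 1:
        total += (1 - z) * math.log((1 - theta) / (1 - z))
    return total


def stopping_rule_a(p_hat, n, eps, zeta_delta):
    """Stop when M(z, z + eps) <= ln(zeta*delta) / n, where z = 1/2 - |1/2 - p_hat|."""
    z = 0.5 - abs(0.5 - p_hat)  # fold the estimate into [0, 1/2]
    return m_func(z, z + eps) <= math.log(zeta_delta) / n
```

Because the estimate is folded into $[0, \frac{1}{2}]$, the rule treats $\hat{p}_\ell$ and $1 - \hat{p}_\ell$ identically.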

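3.7 Stopping Rule from Confidence Intervals of Chen et al.

Using the interval estimation method of Chen et al. [16], a sequence of confidence intervals $[L_\ell, U_\ell]$, $\ell = 1, \cdots, s$ for $p$ can be obtained such that

$$L_\ell = \max\left\{ 0, \ \hat{p}_\ell + \frac{3}{4} \cdot \frac{1 - 2\hat{p}_\ell - \sqrt{1 + \frac{9 n_\ell}{2 \ln\frac{1}{\zeta\delta}} \hat{p}_\ell (1 - \hat{p}_\ell)}}{1 + \frac{9 n_\ell}{8 \ln\frac{1}{\zeta\delta}}} \right\},$$

$$U_\ell = \min\left\{ 1, \ \hat{p}_\ell + \frac{3}{4} \cdot \frac{1 - 2\hat{p}_\ell + \sqrt{1 + \frac{9 n_\ell}{2 \ln\frac{1}{\zeta\delta}} \hat{p}_\ell (1 - \hat{p}_\ell)}}{1 + \frac{9 n_\ell}{8 \ln\frac{1}{\zeta\delta}}} \right\}$$

and that $\Pr\{L_\ell \le p \le U_\ell \mid p\} \ge 1 - 2\zeta\delta$ for $\ell = 1, \cdots, s$ and $p \in (0, 1)$. Under the assumption that $0 < \zeta\delta < 1$ and $0 < \varepsilon < \frac{1}{2}$, by similar techniques as the proof of Theorem 1 of [H], it can be shown that $\{\hat{p}_\ell - \varepsilon \le L_\ell \le U_\ell \le \hat{p}_\ell + \varepsilon\} = \left\{ \left( \left| \hat{p}_\ell - \frac{1}{2} \right| - \frac{2}{3}\varepsilon \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)} \right\}$ for $\ell = 1, \cdots, s$. This implies that applying the sequence of confidence intervals of Chen et al. to the general stopping rule (9) leads to Stopping Rule B.

Actually, the confidence intervals of Chen et al. [16] are derived from Massart's inequality [29] on the tail probabilities of the sample mean of Bernoulli random variables. For this reason, Stopping Rule B is also referred to as the stopping rule from Massart's inequality in [9, Section 4.1.1].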



4 Double-Parabolic Sequential Estimation 

From Sections 2.2, 3.2 and 3.7, it can be seen that, by introducing a new parameter $\rho \in [0, 1]$ and letting $\rho$ take values $\frac{2}{3}$ and $0$ respectively, Stopping Rules B and D can be accommodated as special cases of the following general stopping rule:

Continue the sampling process until

$$\left( \left| \hat{p}_\ell - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)} \qquad (11)$$

for some $\ell \in \{1, 2, \cdots, s\}$, where $\zeta \in (0, \frac{1}{\delta})$.

Moreover, as can be seen from (10), the stopping rule derived from applying Wilson's confidence intervals to (9) can also be viewed as a special case of such general stopping rule with $\rho = 1$.

From the stopping condition (11), it can be seen that the stopping boundary is associated with the double-parabolic function $f(x) = \frac{2}{\varepsilon^2} \ln\frac{1}{\zeta\delta} \left[ \frac{1}{4} - \left( \left| x - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \right]$ such that $x$ and $f(x)$ correspond to the sample mean and sample size respectively. For $\varepsilon = 0.1$, $\delta = 0.05$ and $\zeta = 1$, stopping boundaries with various $\rho$ are shown by Figure 1.

Figure 1: Double-parabolic sampling

For fixed $\varepsilon$ and $\delta$, the parameters $\rho$ and $\zeta$ affect the shape of the stopping boundary as follows. As $\rho$ increases, the span of the stopping boundary increases along the axis of sample mean. By decreasing $\zeta$, the stopping boundary can be dragged toward the direction of increasing sample size. Hence, the parameter $\rho$ is referred to as the dilation coefficient. The parameter $\zeta$ is referred to as the coverage tuning parameter. Since the stopping boundary consists of two parabolas, this approach of estimating a binomial proportion is referred to as the double-parabolic sequential estimation method.
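To make the roles of $\rho$ and $\zeta$ concrete, the stopping boundary can be evaluated directly. The sketch below writes the boundary as the sample size at which the stopping condition (11) first holds for a given sample mean; the function names are hypothetical and the parameter values are only examples.

```python
import math


def boundary(x, eps, delta, rho, zeta):
    """Sample size f(x) on the stopping boundary at sample mean x."""
    log_term = math.log(1.0 / (zeta * delta))
    return (2.0 / eps**2) * log_term * (0.25 - (abs(x - 0.5) - rho * eps) ** 2)


def should_stop(p_hat, n, eps, delta, rho, zeta):
    """Stopping condition (11), rewritten as n >= f(p_hat)."""
    return n >= boundary(p_hat, eps, delta, rho, zeta)
```

Under these formulas the boundary is symmetric about $x = \frac{1}{2}$, peaks at $\left| x - \frac{1}{2} \right| = \rho\varepsilon$ with height $\frac{1}{2\varepsilon^2} \ln\frac{1}{\zeta\delta}$, and scales up uniformly as $\zeta$ decreases.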



4.1 Parametrization of the Sampling Scheme 

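In this section, we shall parameterize the double-parabolic sequential sampling scheme by the method described in Section 2.2. From the stopping condition (11), the stopping rule can be restated as: Continue sampling until $\mathscr{D}(\hat{p}_\ell, n_\ell) = 1$ for some $\ell \in \{1, \cdots, s\}$, where the function $\mathscr{D}(., .)$ is defined by

$$\mathscr{D}(z, n) = \begin{cases} 1 & \text{if } \left( \left| z - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n}{2 \ln(\zeta\delta)}, \\ 0 & \text{otherwise.} \end{cases} \qquad (12)$$

Clearly, the function $\mathscr{D}(., .)$ associated with the double-parabolic sequential sampling scheme depends on the design parameters $\rho$, $\zeta$, $\varepsilon$ and $\delta$. Applying the function $\mathscr{D}(., .)$ defined by (12) to the general definition of the minimum sample size yields

$$N_{\min} = \min\left\{ n \in \mathbb{N} : \left( \left| \frac{k}{n} - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n}{2 \ln(\zeta\delta)} \text{ for some nonnegative integer } k \text{ not exceeding } n \right\}. \qquad (13)$$

Since $\varepsilon$ is usually small in practical applications, we restrict $\varepsilon$ to satisfy $0 < \rho\varepsilon \le \frac{1}{4}$. As a consequence of $0 \le \rho\varepsilon \le \frac{1}{4}$ and the fact that $\left| z - \frac{1}{2} \right| \le \frac{1}{2}$ for any $z \in [0, 1]$, it must be true that $\left( \left| z - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \le \left( \frac{1}{2} - \rho\varepsilon \right)^2$ for any $z \in [0, 1]$. It follows from (13) that $\left( \frac{1}{2} - \rho\varepsilon \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 N_{\min}}{2 \ln(\zeta\delta)}$, which implies that the minimum sample size can be taken as

$$N_{\min} = \left\lceil 2\rho \left( \frac{1}{\varepsilon} - \rho \right) \ln\frac{1}{\zeta\delta} \right\rceil. \qquad (14)$$

On the other hand, applying the function $\mathscr{D}(., .)$ defined by (12) to the general definition of the maximum sample size gives

$$N_{\max} = \min\left\{ n \in \mathbb{N} : \left( \left| \frac{k}{n} - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \ge \frac{1}{4} + \frac{\varepsilon^2 n}{2 \ln(\zeta\delta)} \text{ for all nonnegative integers } k \text{ not exceeding } n \right\}. \qquad (15)$$

Since $\left( \left| z - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \ge 0$ for any $z \in [0, 1]$, it follows from (15) that $\frac{1}{4} + \frac{\varepsilon^2 N_{\max}}{2 \ln(\zeta\delta)} \le 0$, which implies that the maximum sample size can be taken as

$$N_{\max} = \left\lceil \frac{1}{2\varepsilon^2} \ln\frac{1}{\zeta\delta} \right\rceil. \qquad (16)$$

Therefore, the sample sizes $n_1, \cdots, n_s$ can be chosen as functions of $\rho$, $\zeta$, $\varepsilon$ and $\delta$ which satisfy the following constraint:

$$N_{\min} \le n_1 < \cdots < n_{s-1} < N_{\max} \le n_s. \qquad (17)$$

In particular, if the number of stages $s$ is given and the group sizes are expected to be approximately equal, then the sample sizes $n_1, \cdots, n_s$ for all stages can be obtained by substituting $N_{\min}$ defined by (14) and $N_{\max}$ defined by (16) into (5). For example, if the values of design parameters are $\varepsilon = 0.05$, $\delta = 0.05$, $\rho = \frac{3}{4}$, $\zeta = 2.6759$ and $s = 7$, then the sample sizes of this sampling scheme are calculated as

$n_1 = 59$, $n_2 = 116$, $n_3 = 173$, $n_4 = 231$, $n_5 = 288$, $n_6 = 345$, $n_7 = 403$.

The stopping rule is completely determined by substituting the values of design parameters into (11).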
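The worked example can be reproduced with a few lines of code. In the sketch below, the minimum and maximum sample sizes follow (14) and (16) with the results rounded up to integers. Equation (5) is not reproduced in this section, so the spacing rule used here (linear interpolation between the two integer bounds, rounded down) is an assumption; it happens to reproduce the seven sample sizes listed above. Function names are hypothetical.

```python
import math


def sample_size_bounds(eps, delta, rho, zeta):
    """Integer minimum and maximum sample sizes, following (14) and (16)."""
    log_term = math.log(1.0 / (zeta * delta))
    n_min = math.ceil(2.0 * rho * (1.0 / eps - rho) * log_term)
    n_max = math.ceil(log_term / (2.0 * eps**2))
    return n_min, n_max


def stage_sizes(eps, delta, rho, zeta, s):
    """Approximately equal group sizes (assumed stand-in for equation (5))."""
    n_min, n_max = sample_size_bounds(eps, delta, rho, zeta)
    # Integer arithmetic: interpolate linearly and round down.
    return [n_min + ((l - 1) * (n_max - n_min)) // (s - 1) for l in range(1, s + 1)]
```

Calling `stage_sizes(0.05, 0.05, 0.75, 2.6759, 7)` gives the stage sizes for the example in the text.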



4.2 Uniform Controllability of Coverage Probability 

Clearly, for pre-specified $\varepsilon$, $\delta$ and $\rho$, the coverage probability $\Pr\{|\hat{p} - p| < \varepsilon \mid p\}$ depends on the parameter $\zeta$, the number of stages $s$, and the sample sizes $n_1, \cdots, n_s$. As illustrated in Section 4.1, the number of stages $s$ and the sample sizes $n_1, \cdots, n_s$ can be defined as functions of $\zeta \in (0, \frac{1}{\delta})$. That is, the stopping rule can be parameterized by $\zeta$. Accordingly, for any $p \in (0, 1)$, the coverage probability $\Pr\{|\hat{p} - p| < \varepsilon \mid p\}$ becomes a function of $\zeta$. The following theorem shows that it suffices to choose $\zeta \in (0, \frac{1}{\delta})$ small enough to guarantee the pre-specified confidence level.

Theorem 2 Let $\varepsilon, \delta \in (0, 1)$ and $\rho \in (0, 1]$ be fixed. Assume that the number of stages $s$ and the sample sizes $n_1, \cdots, n_s$ are functions of $\zeta \in (0, \frac{1}{\delta})$ such that the constraint (17) is satisfied. Then, $\Pr\{|\hat{p} - p| < \varepsilon \mid p\}$ is no less than $1 - \delta$ for any $p \in (0, 1)$ provided that

$$0 < \zeta \le \frac{1}{\delta} \exp\left( \frac{\ln\frac{\delta}{2} + \ln\left[ 1 - \exp(-2\varepsilon^2) \right]}{4\varepsilon\rho(1 - \rho\varepsilon)} \right).$$

See Appendix B for a proof. For Theorem 2 to be valid, the choice of sample sizes is very flexible. Specifically, the sample sizes can form arithmetic or geometric progressions or any other sequence, as long as the constraint (17) is satisfied. It can be seen that for the coverage probability to be uniformly controllable, the dilation coefficient $\rho$ must be greater than $0$. Theorem 2 asserts that there exists $\zeta > 0$ such that the coverage probability is no less than $1 - \delta$, regardless of the associated binomial proportion $p$. For the purpose of reducing sampling cost, we want to have a value of $\zeta$ as large as possible such that the pre-specified confidence level is guaranteed for any $p \in (0, 1)$. This can be accomplished by the technical components introduced in Sections 2.1, 2.3, 2.4 and 2.5. Clearly, for every value of $\rho$, we can obtain a corresponding value of $\zeta$ (as large as possible) to ensure the desired confidence level. However, the performances of the resultant stopping rules are different. Therefore, we can try a number of values of $\rho$ and pick the best resultant stopping rule for practical use.
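To get a feel for how conservative the sufficient condition of Theorem 2 is, the upper bound on $\zeta$ can be evaluated numerically. The sketch below assumes the bound has the form $\frac{1}{\delta} \exp\left( \frac{\ln(\delta/2) + \ln[1 - \exp(-2\varepsilon^2)]}{4\varepsilon\rho(1 - \rho\varepsilon)} \right)$; the function name is hypothetical.

```python
import math


def zeta_upper_bound(eps, delta, rho):
    """Largest zeta allowed by the sufficient condition of Theorem 2 (as read above)."""
    # -expm1(-x) computes 1 - exp(-x) accurately for small x.
    numerator = math.log(delta / 2.0) + math.log(-math.expm1(-2.0 * eps**2))
    return math.exp(numerator / (4.0 * eps * rho * (1.0 - rho * eps))) / delta
```

For typical design parameters such as $\varepsilon = 0.1$, $\delta = 0.05$ and $\rho = \frac{3}{4}$, this bound is many orders of magnitude smaller than the tuned values of $\zeta$ reported later in the paper, which illustrates why the theorem serves as an existence result and the actual value of $\zeta$ is found computationally.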





4.3 Asymptotic Optimality of Sampling Schemes 



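Now we shall provide an important reason why we propose the sampling scheme of that structure by showing its asymptotic optimality. Since the performance of a group sampling scheme will be close to its fully sequential counterpart, we investigate the optimality of the fully sequential sampling scheme. In this scenario, the sample sizes $n_1, n_2, \cdots, n_s$ are consecutive integers such that

$$\left\lceil 2\rho \left( \frac{1}{\varepsilon} - \rho \right) \ln\frac{1}{\zeta\delta} \right\rceil = n_1 < n_2 < \cdots < n_{s-1} < n_s = \left\lceil \frac{1}{2\varepsilon^2} \ln\frac{1}{\zeta\delta} \right\rceil. \qquad (18)$$

The fully sequential sampling scheme can be viewed as a special case of a group sampling scheme of $s = n_s - n_1 + 1$ stages and group size 1. Clearly, if $\delta$, $\zeta$ and $\rho$ are fixed, the sampling scheme is dependent only on $\varepsilon$. Hence, for any $p \in (0, 1)$, if we allow $\varepsilon$ to vary in $(0, 1)$, then the coverage probability $\Pr\{|\hat{p} - p| < \varepsilon \mid p\}$ and the average sample number $E[\mathbf{n}]$ are functions of $\varepsilon$. We are interested in knowing the asymptotic behavior of these functions as $\varepsilon \to 0$, since $\varepsilon$ is usually small in practical situations. The following theorem provides us the desired insights.

Theorem 3 Assume that $\delta \in (0, 1)$, $\zeta \in (0, \frac{1}{\delta})$ and $\rho \in (0, 1]$ are fixed. Define $\mathcal{N}(p, \varepsilon, \delta, \zeta) = \frac{2p(1 - p) \ln\frac{1}{\zeta\delta}}{\varepsilon^2}$ for $p \in (0, 1)$ and $\varepsilon \in (0, 1)$. Then,

$$\Pr\left\{ \lim_{\varepsilon \to 0} \frac{\mathbf{n}}{\mathcal{N}(p, \varepsilon, \delta, \zeta)} = 1 \ \Big| \ p \right\} = 1,$$

$$\lim_{\varepsilon \to 0} \Pr\{|\hat{p} - p| < \varepsilon \mid p\} = 2\Phi\left( \sqrt{2 \ln\frac{1}{\zeta\delta}} \right) - 1, \qquad (19)$$

$$\lim_{\varepsilon \to 0} \frac{E[\mathbf{n}]}{\mathcal{N}(p, \varepsilon, \delta, \zeta)} = 1 \qquad (20)$$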



for any $p \in (0, 1)$.

See Appendix C for a proof. From (19), it can be seen that $\lim_{\varepsilon \to 0} \Pr\{|\hat{p} - p| < \varepsilon \mid p\} = 1 - \delta$ for any $p \in (0, 1)$ if $\zeta = \frac{1}{\delta} \exp\left( -\frac{1}{2} Z_{\delta/2}^2 \right)$. Such value can be taken as an initial value for the coverage tuning parameter $\zeta$. In addition to providing guidance on the coverage tuning techniques, Theorem 3 also establishes the optimality of the sampling scheme. To see this, let $\mathcal{N}(p, \varepsilon, \delta)$ denote the minimum sample size $n$ required for a fixed-sample-size procedure to guarantee that $\Pr\{|\overline{X}_n - p| < \varepsilon \mid p\} \ge 1 - \delta$ for any $p \in (0, 1)$, where $\overline{X}_n = \frac{\sum_{i=1}^n X_i}{n}$. It is well known that from the central limit theorem,

$$\lim_{\varepsilon \to 0} \frac{\mathcal{N}(p, \varepsilon, \delta)}{p(1 - p) \left( \frac{Z_{\delta/2}}{\varepsilon} \right)^2} = 1. \qquad (21)$$

Making use of (20), (21) and letting $\zeta = \frac{1}{\delta} \exp\left( -\frac{1}{2} Z_{\delta/2}^2 \right)$, we have $\lim_{\varepsilon \to 0} \frac{\mathcal{N}(p, \varepsilon, \delta)}{E[\mathbf{n}]} = 1$ for $p \in (0, 1)$ and $\delta \in (0, 1)$, which implies the asymptotic optimality of the double-parabolic sampling scheme.

By virtue of (20), an approximate formula for computing the average sample number is given as

$$E[\mathbf{n}] \approx \mathcal{N}(p, \varepsilon, \delta, \zeta) = \frac{2p(1 - p) \ln\frac{1}{\zeta\delta}}{\varepsilon^2} \qquad (22)$$

for $p \in (0, 1)$ and $\varepsilon \in (0, 1)$. From (21), one obtains $\mathcal{N}(p, \varepsilon, \delta) \approx p(1 - p) \left( \frac{Z_{\delta/2}}{\varepsilon} \right)^2$, which is a well-known result in statistics. In situations where no information about $p$ is available, one usually uses

$$N_{\text{normal}} \overset{\text{def}}{=} \left\lceil \frac{Z_{\delta/2}^2}{4\varepsilon^2} \right\rceil \qquad (23)$$

as the sample size for estimating the binomial proportion $p$ with prescribed margin of error $\varepsilon$ and confidence level $1 - \delta$. Since the sample size formula (23) can lead to under-coverage, researchers in many areas are willing to use a more conservative but rigorous sample size formula

$$N_{\text{ch}} \overset{\text{def}}{=} \left\lceil \frac{\ln\frac{2}{\delta}}{2\varepsilon^2} \right\rceil \qquad (24)$$

which is derived from the Chernoff-Hoeffding bound [2, 25]. Comparing (22) and (24), one can see that under the premise of guaranteeing the prescribed confidence level $1 - \delta$, the double-parabolic sampling scheme can lead to a substantial reduction of sample number when the unknown binomial proportion $p$ is close to $0$ or $1$.
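The quantities compared in this subsection are easy to tabulate. The sketch below computes the initial coverage tuning parameter $\zeta = \frac{1}{\delta} \exp(-\frac{1}{2} Z_{\delta/2}^2)$, the two fixed sample sizes (23) and (24) rounded up to integers, and the asymptotic approximation (22) of the average sample number. Function names are hypothetical, and the normal quantile comes from Python's standard library.

```python
import math
from statistics import NormalDist


def initial_zeta(delta):
    """Initial coverage tuning parameter suggested by the asymptotic result (19)."""
    z = NormalDist().inv_cdf(1.0 - delta / 2.0)  # Z_{delta/2}
    return math.exp(-0.5 * z * z) / delta


def n_normal(eps, delta):
    """Normal-approximation sample size (23), worst case p = 1/2."""
    z = NormalDist().inv_cdf(1.0 - delta / 2.0)
    return math.ceil(z * z / (4.0 * eps**2))


def n_chernoff(eps, delta):
    """Chernoff-Hoeffding sample size (24)."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps**2))


def approx_average_sample_number(p, eps, delta, zeta):
    """Asymptotic approximation (22) of E[n] for the double-parabolic scheme."""
    return 2.0 * p * (1.0 - p) * math.log(1.0 / (zeta * delta)) / eps**2
```

Evaluating `approx_average_sample_number` for $p$ near $0$ or $1$ and comparing it with `n_chernoff` shows the reduction described in the text; the approximation is asymptotic in $\varepsilon$ and is not a guarantee for any particular finite design.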



4.4 Bounds on Distribution and Expectation of Sample Number 

We shall derive analytic bounds for the cumulative distribution function and expectation of the 
sample number n associated with the double-parabolic sampling scheme. In this direction, we have 
obtained the following results. 

Theorem 4 Let $p \in (0, \frac{1}{2})$. Define $a_\ell = \frac{1}{2} - \rho\varepsilon - \sqrt{\frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)}}$ for $\ell = 1, \cdots, s - 1$. Let $\tau$ denote the index of stage such that $a_{\tau - 1} \le p < a_\tau$. Then, $\Pr\{\mathbf{n} > n_\ell \mid p\} \le \exp(n_\ell \mathscr{M}(a_\ell, p))$ for $\tau \le \ell < s$. Moreover, $E[\mathbf{n}] \le n_\tau + \sum_{\ell = \tau}^{s - 1} (n_{\ell + 1} - n_\ell) \exp(n_\ell \mathscr{M}(a_\ell, p))$.

See Appendix D for a proof. By the symmetry of the double-parabolic sampling scheme, similar analytic bounds for the distribution and expectation of the sample number can be derived for the case that $p \in (\frac{1}{2}, 1)$.

5 Comparison of Computational Methods



In this section, we shall compare various computational methods. First, we will illustrate why a 
frequently-used method of evaluating the coverage probability based on gridding the parameter 
space is not rigorous and is less efficient as compared to the Adapted B&B Algorithm. Second, we 
will introduce the Adaptive Maximum Checking Algorithm of [9] which has better computational 
efficiency as compared to the Adapted B&B Algorithm. Third, we will explain that it is more 
advantageous in terms of numerical accuracy to work with the complementary coverage probability 
as compared to direct evaluation of the coverage probability. Finally, we will compare the computational methods of Chen [6], [H], [10], [12], [13] and Frey [22] for the design of sequential procedures for estimating a binomial proportion.

5.1 Verifying Coverage Guarantee without Gridding Parameter Space 

For the purpose of constructing a sampling scheme so that the prescribed confidence level $1 - \delta$ is guaranteed, an essential task is to determine whether the coverage probability $\Pr\{|\hat{p} - p| < \varepsilon \mid p\}$ associated with a given stopping rule is no less than $1 - \delta$. In other words, it is necessary to compare the infimum of coverage probability with $1 - \delta$. To accomplish such a task of checking coverage guarantee, a natural method is to evaluate the infimum of coverage probability as follows:

(i) : Choose $m$ grid points $p_1, \cdots, p_m$ from parameter space $(0, 1)$.

(ii) : Compute $c_j = \Pr\{|\hat{p} - p| < \varepsilon \mid p_j\}$ for $j = 1, \cdots, m$.

(iii) : Take $\min\{c_1, \cdots, c_m\}$ as $\inf_{p \in (0, 1)} \Pr\{|\hat{p} - p| < \varepsilon \mid p\}$.

This method can be easily mistaken as an exact approach and has been frequently used for evaluating coverage probabilities in many problem areas.

It is not hard to show that if the sample size $\mathbf{n}$ of a sequential procedure has a support $\mathscr{S}$, then the coverage probability $\Pr\{|\hat{p} - p| < \varepsilon \mid p\}$ is discontinuous at $p \in \mathscr{P} \cap (0, 1)$, where $\mathscr{P} = \left\{ \frac{k}{n} \pm \varepsilon : k \text{ is a nonnegative integer no greater than } n \in \mathscr{S} \right\}$. The set $\mathscr{P}$ typically has a large number of parameter values. Due to the discontinuity of the coverage probability as a function of $p$, the coverage probabilities can differ significantly for two parameter values which are extremely close. This implies that an intolerable error can be introduced by taking the minimum of coverage probabilities of a finite number of parameter values as the infimum of coverage probability on the whole parameter space. So, if one simply uses the minimum of the coverage probabilities of a finite number of parameter values as the infimum of coverage probability to check the coverage guarantee, the sequential estimator $\hat{p}$ of the resultant stopping rule may fail to guarantee the prescribed confidence level.
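The discontinuity can be seen even in the simplest case. The sketch below is an illustration under a simplifying assumption: it uses a single-stage procedure with fixed sample size $n$, so that $\hat{p} = K/n$ and the coverage probability is a finite binomial sum. It evaluates the coverage at two parameter values lying on either side of one point of the form $\frac{k}{n} \pm \varepsilon$. The function name is hypothetical.

```python
import math


def fixed_n_coverage(p, n, eps):
    """Pr{|K/n - p| < eps | p} for K ~ Binomial(n, p) (single-stage case)."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if abs(k / n - p) < eps
    )


# Two parameter values that differ by 2e-9 but straddle p = 0/50 + 0.1.
left = fixed_n_coverage(0.1 - 1e-9, 50, 0.1)
right = fixed_n_coverage(0.1 + 1e-9, 50, 0.1)
jump = right - left
```

The two values of $p$ are practically indistinguishable, yet the set of counts $k$ that satisfy $|k/n - p| < \varepsilon$ changes between them, so the coverage jumps. A grid that happens to sample only one side of such a point cannot detect the lower value on the other side.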

In addition to the lack of rigorousness, another drawback of checking coverage guarantee based 
on the method of gridding parameter space is its low efficiency. A critical issue is on the choice of 
the number, m, of grid points. If the number m is too small, the induced error can be substantial. 
On the other hand, choosing a large number for m results in high computational complexity. 

In contrast to the method based on gridding parameter space, the Adapted B&B Algorithm is a rigorous approach for checking coverage guarantee as a consequence of the mechanism for comparing the bounds of coverage probability with the prescribed confidence level. The algorithm is also efficient due to the mechanism of pruning branches.

5.2 Adaptive Maximum Checking Algorithm

As illustrated in Section 2, the techniques developed in [6], [H], [10], [12], [13] are sufficient to provide exact solutions for a wide range of sequential estimation problems. However, one of the four components, the Adapted B&B Algorithm, requires computing both the lower and upper bounds of the complementary coverage probability. To further reduce the computational complexity, it is desirable to have a checking algorithm which needs only one of the lower and upper bounds. For this purpose, Chen had developed the Adaptive Maximum Checking Algorithm (AMCA) in [9, Section 3.3] and [14, Section 2.7]. In the following introduction of the AMCA, we shall follow the description of [9]. The AMCA can be applied to a wide class of computational problems dependent on the following critical subroutine:

Determine whether a function $C(\theta)$ is smaller than a prescribed number $\delta$ for every value of $\theta$ contained in interval $[\underline{\theta}, \overline{\theta}]$.

Specifically, for checking the coverage guarantee in the context of estimating a binomial proportion, the parameter $\theta$ is the binomial proportion $p$ and the function $C(\theta)$ is actually the complementary coverage probability. In many situations, it is impossible or very difficult to evaluate $C(\theta)$ for every value of $\theta$ in interval $[\underline{\theta}, \overline{\theta}]$, since the interval may contain infinitely many or an extremely large number of values. Similar to the Adapted B&B Algorithm, the purpose of AMCA is to reduce the computational complexity associated with the problem of determining whether the maximum of $C(\theta)$ over $[\underline{\theta}, \overline{\theta}]$ is less than $\delta$. The only assumption required for AMCA is that, for any interval $[a, b] \subseteq [\underline{\theta}, \overline{\theta}]$, it is possible to compute an upper bound $\overline{C}(a, b)$ such that $C(\theta) \le \overline{C}(a, b)$ for any $\theta \in [a, b]$ and that the upper bound converges to $C(\theta)$ as the interval width $b - a$ tends to $0$. The backward AMCA proceeds as follows:

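∇ Choose initial step size $d > \eta$.

∇ Let $F \leftarrow 0$, $T \leftarrow 0$ and $b \leftarrow \overline{\theta}$.

∇ While $F = T = 0$, do the following:

    ◦ Let $\text{st} \leftarrow 0$ and $\ell \leftarrow 2$;

    ◦ While $\text{st} = 0$, do the following:

        ★ Let $\ell \leftarrow \ell - 1$ and $d \leftarrow d \, 2^\ell$.

        ★ If $b - d > \underline{\theta}$, then let $a \leftarrow b - d$ and $T \leftarrow 0$. Otherwise, let $a \leftarrow \underline{\theta}$ and $T \leftarrow 1$.

        ★ If $\overline{C}(a, b) < \delta$, then let $\text{st} \leftarrow 1$ and $b \leftarrow a$.

        ★ If $d < \eta$, then let $\text{st} \leftarrow 1$ and $F \leftarrow 1$.

∇ Return $F$.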

The output of the backward AMCA is a binary variable $F$ such that "$F = 0$" means "$C(\theta) < \delta$" and "$F = 1$" means "$C(\theta) \ge \delta$". An intermediate variable $T$ is introduced in the description of AMCA such that "$T = 1$" means that the left endpoint of the interval is reached. The backward AMCA starts from the right endpoint of the interval (i.e., $b = \overline{\theta}$) and attempts to find an interval $[a, b]$ such that $\overline{C}(a, b) < \delta$. If such an interval is available, then attempt to go backward to find the next consecutive interval with twice the width. If doubling the interval width fails to guarantee $\overline{C}(a, b) < \delta$, then try to repeatedly cut the interval width in half to ensure that $\overline{C}(a, b) < \delta$. If the interval width becomes smaller than a prescribed tolerance $\eta$, then AMCA declares that "$F = 1$". For our relevant statistical problems, if $C(\theta) \ge \delta$ for some $\theta \in [\underline{\theta}, \overline{\theta}]$, it is sure that "$F = 1$" will be declared. On the other hand, it is possible that "$F = 1$" is declared even though $C(\theta) < \delta$ for any $\theta \in [\underline{\theta}, \overline{\theta}]$. However, such situation can be made extremely rare and immaterial if we choose $\eta$ to be a very small number. Moreover, this will only introduce negligible conservativeness in the evaluation of $C(\theta)$ if $\eta$ is chosen to be sufficiently small (e.g., $\eta$ = 10~^^). Clearly, the backward AMCA can be easily modified as forward AMCA. Moreover, the AMCA can also be
easily modified as Adaptive Minimum Checking Algorithm (forward and backward). For checking the maximum of complementary coverage probability $\Pr\{|\hat{p} - p| \ge \varepsilon \mid p\}$, one can use the AMCA with $C(p) = \Pr\{|\hat{p} - p| \ge \varepsilon \mid p\}$ over interval $[0, \frac{1}{2}]$. We would like to point out that, in contrast to the Adapted B&B Algorithm, it seems difficult to generalize the AMCA to problems involving multidimensional parameter spaces.
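The backward scan can be sketched in a few lines of code. The version below follows the prose description (try an interval twice as wide as the last accepted one, then halve the width until the bound certifies the interval or the width drops below the tolerance) rather than transcribing the step-size update symbol by symbol. The function names, the default tolerance `eta=1e-15`, the initial step `d0`, and the toy function used for the bound are all assumptions made for illustration; they are not taken from [9].

```python
def backward_amca(upper_bound, theta_lo, theta_hi, delta, d0=0.01, eta=1e-15):
    """Return 0 if C(theta) < delta is certified on [theta_lo, theta_hi], else 1.

    upper_bound(a, b) must return a number that is >= C(theta) for all theta in [a, b].
    """
    d = d0
    b = theta_hi
    while True:
        d *= 2.0  # first try an interval twice as wide as the last one
        while True:
            a = max(theta_lo, b - d)  # never step past the left endpoint
            if upper_bound(a, b) < delta:
                break  # the bound certifies C < delta on [a, b]
            d /= 2.0  # otherwise halve the width and retry
            if d < eta:
                return 1  # width below tolerance: declare F = 1
        if a <= theta_lo:
            return 0  # left endpoint reached: F = 0
        b = a


def example_upper_bound(a, b):
    """Exact supremum over [a, b] of the toy function C(theta) = 0.2*theta*(1-theta)."""
    def c(t):
        return 0.2 * t * (1.0 - t)
    return 0.05 if a <= 0.5 <= b else max(c(a), c(b))
```

With the toy function, whose maximum is $0.05$, a call with `delta=0.06` can certify the whole interval in a handful of steps, whereas a call with `delta=0.04` shrinks the step near the point where the function crosses $0.04$ until the tolerance is hit. In a real application `upper_bound` would be an interval bound on the complementary coverage probability.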

5.3 Working with Complementary Coverage Probability 

We would like to point out that, instead of evaluating the coverage probability as in [22], it is better to evaluate the complementary coverage probability for the purpose of reducing numerical error. The advantage of working on the complementary coverage probability can be explained as follows: Note that, in many cases, the coverage probability is very close to 1 and the complementary coverage probability is very close to 0. Since the absolute precision for computing a number close to 1 is much lower than the absolute precision for computing a number close to 0, the method of directly evaluating the coverage probability will lead to intolerable numerical error for problems involving small $\delta$. As an example, consider a situation that the complementary coverage probability is in the order of 10~^. Direct computation of the coverage probability can easily lead to an absolute error of the order of 10~^. However, the absolute error of computing the complementary coverage probability can be readily controlled at the order of 10~^.
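The effect is easy to reproduce in double-precision arithmetic. The sketch below uses an illustrative assumption (a single binomial count with a wide acceptance region, so that the complementary probability is far below machine epsilon) and computes the same quantity in two ways: by summing the tail terms directly, and by subtracting the coverage probability from 1. Function names are hypothetical.

```python
import math


def binom_pmf(k, n, p):
    """Probability that a Binomial(n, p) count equals k."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def complement_direct(n, p, k_lo, k_hi):
    """Sum the probabilities of counts outside {k_lo, ..., k_hi} directly."""
    return sum(binom_pmf(k, n, p) for k in range(n + 1) if k < k_lo or k > k_hi)


def complement_via_coverage(n, p, k_lo, k_hi):
    """Compute the same quantity as 1 minus the probability of {k_lo, ..., k_hi}."""
    return 1.0 - sum(binom_pmf(k, n, p) for k in range(k_lo, k_hi + 1))


direct = complement_direct(400, 0.5, 100, 300)
via_coverage = complement_via_coverage(400, 0.5, 100, 300)
```

The direct sum retains its leading digits because each term is a small number represented with full relative precision. The subtraction can only resolve differences of roughly $10^{-16}$ near 1, so anything smaller than that is lost.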

5.4 Comparison of Approaches of Chen and J. Frey

As mentioned in the introduction, J. Frey published a paper [22] in The American Statistician (TAS) on the sequential estimation of a binomial proportion with prescribed margin of error and confidence level. The approaches of Chen and Frey are based on the same strategy as follows: First, construct a family of stopping rules parameterized by $\gamma$ (and possibly other design parameters) so that the associated coverage probability $\Pr\{|\hat{p} - p| < \varepsilon \mid p\}$ can be controlled by parameter $\gamma$ in the sense that the coverage probability can be made arbitrarily close to 1 by increasing $\gamma$. Second, adaptively and rigorously check the coverage guarantee by virtue of bounds of coverage probabilities. Third, apply a bisection search method to determine the parameter $\gamma$ so that the coverage probability is no less than the prescribed confidence level $1 - \delta$ for any $p \in (0, 1)$.

For the purpose of controlling the coverage probability, Frey [22] applied the inclusion principle previously proposed in [13, Section 3] and used in [6], [H], [10], [12]. As illustrated in Section 3, the central idea of the inclusion principle is to use a sequence of confidence intervals to construct stopping rules so that the sampling process is continued until a confidence interval is included by an interval defined in terms of the estimator and margin of error. Due to the inclusion relationship, the associated coverage probability can be controlled by the confidence coefficients of the sequence of confidence intervals. The critical value $\gamma$ used by Frey plays the same role for controlling coverage probabilities as that of the coverage tuning parameter $\zeta$ used by Chen. Frey [22] stated stopping rules in terms of confidence limits. This way of expressing stopping rules is straightforward and insightful, since one can readily see the principle behind the construction. For convenience of practical use, Chen proposed to eliminate the necessity of computing confidence limits.

Frey's method for checking coverage guarantee differs from the Adapted B&B Algorithm, but coincides with other techniques of Chen [9]. On September 18, 2011, in response to an inquiry on the coincidence of the research results, Frey simultaneously emailed Xinjia Chen (the coauthor of the present paper) and TAS Editor John Stufken all pre-final revisions of his manuscript for the paper [22]. In his original manuscript submitted to TAS in July 2009, Frey's method was to "simply approximate $CP(\gamma)$ by taking the minimum over the grid of values $p = 1/2001, \cdots, 2000/2001$." In the first revision of his manuscript submitted to TAS in November 2009, Frey's method was to "approximate $CP(\gamma)$ by taking the minimum of $T(p; \gamma)$ over the grid of values $p = 1/2001, \cdots, 2000/2001$ and the set of values of the form p = c ± e, where c ∈ C and e = 10"^*^." In Frey's notational system, $\gamma$ is the critical value which plays the same role as that of the coverage tuning parameter $\zeta$ in the present paper, $T(p; \gamma)$ is the coverage probability, $CP(\gamma)$ is the infimum of coverage probability for $p \in (0, 1)$, and $C = \{p \pm \varepsilon : p \text{ is a possible value of } \hat{p}\} \cap (0, 1)$. From the original and the first revision of his manuscript submitted to TAS before April 2010, it can be seen that Frey's method of checking coverage guarantee was dependent on taking the minimum of coverage probabilities for a finite number of grid points of $p \in (0, 1)$ as the infimum coverage probability for $p \in (0, 1)$. As can be seen from Section 5.1 of the present paper, such method lacks rigorousness and efficiency. In the second revision of his manuscript submitted to TAS in April 2010, for the purpose of checking coverage guarantee, Frey replaced the method of gridding parameter space with an interval bounding technique and proposed a checking algorithm which is essentially the same as the AMCA previously established by Chen [9, Section 3.3] in November 2009.

Similar to the AMCA of [9, Section 3.3], the algorithm of Frey [22, Appendix] for checking coverage guarantee adaptively scans the parameter space based on interval bounding. The adaptive method used by Frey for updating step size is essentially the same as that of the AMCA. Ignoring the number 0.01 in Frey's expression "$\epsilon_j = \min\{0.01, 2(p_{j-1} - p_{j-2})\}$", which has very little impact on the computational efficiency, Frey's step size can be identified as the adaptive step size $d$ in the AMCA. The operation associated with "$\epsilon_j = \min\{0.01, 2(p_{j-1} - p_{j-2})\}$" has a similar function as that of the command "Let $\text{st} \leftarrow 0$ and $\ell \leftarrow 2$" in the outer loop of the AMCA. The operation associated with Frey's expression "$p_{i-1} + \epsilon_i / 2^j$, $j \ge 0$" is equivalent to that of the command "Let $\ell \leftarrow \ell - 1$ and $d \leftarrow d \, 2^\ell$" in the inner loop of the AMCA. Frey proposed to declare a failure of coverage guarantee if "the distance from $p_{i-1}$ to the candidate value for $p_i$ falls below 10~^^". The number "10~^^" actually plays the same role as "$\eta$" in the AMCA, where "$\eta$ = 10~^^" is recommended by [9].

6 Numerical Results



In this section, we shall illustrate the proposed double-parabolic sampling scheme through examples. As demonstrated in Section 2.2 and Section 4, the double-parabolic sampling scheme can be parameterized by the dilation coefficient $\rho$ and the coverage tuning parameter $\zeta$. Hence, the performance of the resultant stopping rule can be optimized with respect to $\rho \in (0, 1]$ and $\zeta$ by choosing various values of $\rho$ from interval $(0, 1]$ and determining the corresponding values of $\zeta$ by the computational techniques introduced in Section 2 to guarantee the desired confidence level.

6.1 Asymptotic Analysis May Be Inadequate 

For fully sequential cases, we have evaluated the double-parabolic sampling scheme with $\varepsilon = 0.1$, $\delta = 0.05$, $\rho = 0.1$ and $\zeta = \frac{1}{\delta} \exp\left( -\frac{1}{2} Z_{\delta/2}^2 \right) \approx 2.93$. The stopping boundary is displayed in the left side of Figure 2. The function of coverage probability with respect to the binomial proportion is shown in the right side of Figure 2, which indicates that the coverage probabilities are generally substantially lower than the prescribed confidence level $1 - \delta = 0.95$. By considering $\varepsilon = 0.1$ as a small number and applying the asymptotic theory, the coverage probability associated with the sampling scheme is expected to be close to 0.95. This numerical example demonstrates that although the asymptotic method is insightful and involves virtually no computation, it may not be adequate.

In general, the main drawback of an asymptotic method is that there is no guarantee of coverage probability. Although an asymptotic method asserts that if the margin of error $\varepsilon$ tends to $0$, the coverage probability will tend to the pre-specified confidence level $1 - \delta$, it is difficult to determine how small the margin of error $\varepsilon$ must be for the asymptotic method to be applicable. Note that $\varepsilon \to 0$ implies the average sample size tends to $\infty$. However, in reality, the sample sizes must be finite. Consequently, an asymptotic method inevitably introduces unknown statistical error. Since an asymptotic method does not necessarily guarantee the prescribed confidence level, it is not fair to compare its associated sample size with that of an exact method, which guarantees the pre-specified confidence level.

This example also indicates that, due to the discrete nature of the problem, the coverage probability is a discontinuous and erratic function of $p$, which implies that Monte Carlo simulation is not suitable for evaluating the coverage performance.

Figure 2: Double-parabolic sampling with $\varepsilon = 0.1$, $\delta = 0.05$, $\rho = \frac{1}{10}$ and $\zeta = 2.93$

6.2 Parametric Values of Fully Sequential Schemes

For fully sequential cases, to allow direct application of our double-parabolic sequential method, we have obtained values of coverage tuning parameter $\zeta$, which guarantee the prescribed confidence levels, for double-parabolic sampling schemes with $\rho = \frac{3}{4}$ and various combinations of $(\varepsilon, \delta)$ as shown in Table 1. We used the computational techniques introduced in Section 2 to obtain this table.



Table 1: Coverage Tuning Parameter

2.1725   0.01   0.05   2.5592   0.01   3.4461



To illustrate the use of Table 1, suppose that one wants a fully sequential sampling procedure
to ensure that $\Pr\{|\widehat{p} - p| < 0.1 \mid p\} > 0.95$ for any $p \in (0, 1)$. This means that one can choose
$\varepsilon = 0.1$, $\delta = 0.05$ and the range of sample size is given by (18). From Table 1, it can be seen
that the value of $\zeta$ corresponding to $\varepsilon = 0.1$, $\delta = 0.05$ is 2.4174. Consequently, the stopping rule
is completely determined by substituting the values of design parameters $\varepsilon = 0.1$, $\delta = 0.05$,
$\rho = \frac{3}{4}$, $\zeta = 2.4174$ into its definition. The stopping boundary of this sampling scheme is displayed in the
left side of Figure 3. The function of coverage probability with respect to the binomial proportion
is shown in the right side of Figure 3.
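As a rough illustration of how these design parameters pin down the scheme, the following Python sketch evaluates the stopping condition $(|\widehat{p} - \frac{1}{2}| - \rho\varepsilon)^2 \geq \frac{1}{4} + \frac{\varepsilon^2 n}{2 \ln(\zeta\delta)}$ and the sample-size bounds used in the proofs of Appendix B. The function names are hypothetical, and rounding the bounds up to integers is an assumption of this sketch rather than something stated in the text.

```python
import math

def should_stop(successes, n, eps, delta, rho, zeta):
    """Double-parabolic stopping condition evaluated at sample size n."""
    p_hat = successes / n
    lhs = (abs(p_hat - 0.5) - rho * eps) ** 2
    # ln(zeta * delta) is negative whenever zeta * delta < 1
    rhs = 0.25 + eps ** 2 * n / (2.0 * math.log(zeta * delta))
    return lhs >= rhs

def sample_size_range(eps, delta, rho, zeta):
    """Smallest and largest sample sizes (rounded up; an assumption here)."""
    log_term = math.log(1.0 / (zeta * delta))
    n_min = math.ceil(2.0 * rho * (1.0 / eps - rho) * log_term)
    n_max = math.ceil(log_term / (2.0 * eps ** 2))
    return n_min, n_max

if __name__ == "__main__":
    print(sample_size_range(0.1, 0.05, 0.75, 2.4174))
```

At the largest sample size the right-hand side of the condition is non-positive, so sampling necessarily stops there whatever the observed frequency is.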
Figure 3: Double-parabolic sampling with $\varepsilon = 0.1$, $\delta = 0.05$, $\rho = \frac{3}{4}$ and $\zeta = 2.4174$
6.3 Parametric Values of Group Sequential Schemes 

In many situations, especially in clinical trials, it is desirable to use group sequential sampling
schemes. In Tables 2 and 3, assuming that the sample sizes satisfy (5) for the purpose of having
approximately equal group sizes, we have obtained parameters for concrete schemes by the
computational techniques introduced in Section 2.

For dilation coefficient $\rho = \frac{3}{4}$ and confidence parameter $\delta = 0.05$, we have obtained values
of the coverage tuning parameter $\zeta$ that guarantee the prescribed confidence level 0.95 for
double-parabolic sampling schemes with the number of stages $s$ ranging from 3 to 10, as shown in Table 2.

For dilation coefficient $\rho = \frac{3}{4}$ and confidence parameter $\delta = 0.01$, we have obtained values
of the coverage tuning parameter $\zeta$ that guarantee the prescribed confidence level 0.99 for
double-parabolic sampling schemes with the number of stages $s$ ranging from 3 to 10, as shown in Table 3.



Table 2: Coverage Tuning Parameter 
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Table 3: Coverage Tuning Parameter 
Figure 4: Double-parabolic sampling with $\varepsilon = \delta = 0.01$, $s = 10$, $\rho = \frac{3}{4}$ and $\zeta = 3.5753$

To illustrate the use of these tables, suppose that one wants a ten-stage sampling procedure of
approximately equal group sizes to ensure that $\Pr\{|\widehat{p} - p| < 0.01 \mid p\} > 0.99$ for any $p \in (0, 1)$.
This means that one can choose $\varepsilon = \delta = 0.01$, $s = 10$ and sample sizes satisfying (5). To obtain
appropriate parameter values for the sampling procedure, one can look at Table 3 to find the
coverage tuning parameter $\zeta$ corresponding to $\varepsilon = 0.01$ and $s = 10$. From Table 3, it can be
seen that $\zeta$ can be taken as 3.5753. Consequently, the stopping rule is completely determined by
substituting the values of design parameters $\varepsilon = 0.01$, $\delta = 0.01$, $\rho = \frac{3}{4}$, $\zeta = 3.5753$, $s = 10$ into
its definition and equation (5). The stopping boundary of this sampling scheme and the function
of coverage probability with respect to the binomial proportion are displayed, respectively, in the
left and right sides of Figure 4.

6.4 Comparison of Sampling Schemes 

We have conducted numerical experiments to investigate the impact of the dilation coefficient $\rho$ on the
performance of our double-parabolic sampling schemes. Our computational experience indicates
that the dilation coefficient $\rho = \frac{3}{4}$ is frequently a good choice in terms of average sample number and
coverage probability. For example, consider the case that the margin of error is given as $\varepsilon = 0.1$
and the prescribed confidence level is $1 - \delta$ with $\delta = 0.05$. For the double-parabolic sampling
scheme with the dilation coefficient p chosen as |, | and 1, we have determined that, to ensure the 
prescribed confidence level $1 - \delta = 0.95$, it suffices to set the coverage tuning parameter to 2.1, 2.4
and 2.4, respectively. The average sample numbers of these sampling schemes and the coverage
probabilities as functions of the binomial proportion are shown, respectively, in the left and right
sides of Figure 5. From Figure 5, it can be seen that a double-parabolic sampling scheme with
dilation coefficient $\rho = \frac{3}{4}$ performs better in terms of average sample number and coverage
probability than double-parabolic sampling schemes with smaller or larger
values of the dilation coefficient.




Figure 5: Double-parabolic sampling with various dilation coefficients (left panel: average sample number; right panel: coverage probability; horizontal axes: binomial proportion)

We have investigated the impact of confidence intervals on the performance of fully sequential
sampling schemes constructed from the inclusion principle. We have observed that the stopping
rule derived from Clopper-Pearson intervals generally outperforms the stopping rules derived from
other types of confidence intervals. However, via an appropriate choice of the dilation coefficient, the
double-parabolic sampling scheme can perform uniformly better than the stopping rule derived from
Clopper-Pearson intervals. To illustrate, consider the case that $\varepsilon = 0.1$ and $\delta = 0.05$. For stopping
rules derived from Clopper-Pearson intervals, Fishman's intervals, Wilson's intervals, and revised
Wald intervals with $a = 4$, we have determined that to guarantee the prescribed confidence level
$1 - \delta = 0.95$, it suffices to set the coverage tuning parameter $\zeta$ to 0.5, 1, 2.4 and 0.37, respectively.
For the stopping rule derived from Wald intervals, we have determined $\zeta = 0.77$ to ensure the
confidence level, under the condition that the minimum sample size is taken as



-In A 



Recall that for the double-parabolic sampling scheme with $\rho = \frac{3}{4}$, we have obtained $\zeta = 2.4$ for the purpose
of guaranteeing the confidence level. The average sample numbers of these sampling schemes are
shown in Figure 6. From these plots, it can be seen that, as compared to the stopping rule derived
from Clopper-Pearson intervals, the stopping rule derived from the revised Wald intervals performs
better in the region of $p$ close to 0 or 1, but performs worse in the region of $p$ in the middle of
$(0, 1)$. The performance of the stopping rules from Fishman's intervals (i.e., from the Chernoff bound)
and Wald intervals is clearly inferior to that of the stopping rule derived from
Clopper-Pearson intervals. It can be observed that the double-parabolic sampling scheme uniformly
outperforms the stopping rule derived from Clopper-Pearson intervals.
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Figure 6: Comparison of average sample numbers 
6.5 Estimation with High Confidence Level 

In some situations, we need to estimate a binomial proportion with a high confidence level. For 
example, one might want to construct a sampling scheme such that, for e = 0.05 and 6 = 10"^", 
the resultant sequential estimator $\widehat{p}$ satisfies $\Pr\{|\widehat{p} - p| < \varepsilon \mid p\} > 1 - \delta$ for any $p \in (0, 1)$. By
working with the complementary coverage probability, we determined that it suffices to let the
dilation coefficient $\rho = \frac{3}{4}$ and the coverage tuning parameter $\zeta = 7.65$. The stopping boundary
and the function of coverage probability with respect to the binomial proportion are displayed,
respectively, in the left and right sides of Figure 7. As addressed in Section 5.3, it should be noted
that it is impossible to obtain such a sampling scheme without working with the complementary
coverage probability.
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Figure 7: Double-parabolic sampling with e = 0.05, (5 = 10 p = \ and Q = 7.65 

7 Illustrative Examples for Clinical Trials 

In this section, we shall illustrate the applications of our double-parabolic group sequential
estimation method in clinical trials.

An example of our double-parabolic sampling scheme can be illustrated as follows. Assume that
$\varepsilon = \delta = 0.05$ is given and that the sampling procedure is expected to have 7 stages with sample sizes
satisfying (5). Choosing $\rho = \frac{3}{4}$, we have determined that it suffices to take $\zeta = 2.6759$ to guarantee
that the coverage probability is no less than $1 - \delta = 0.95$ for all $p \in (0, 1)$. Accordingly, the sample
sizes of this sampling scheme are calculated as 59, 116, 173, 231, 288, 345 and 403. This sampling
scheme, with a sample path, is shown in the left side of Figure 8. In this case, the stopping rule
can be equivalently described by virtue of Figure 8 as: Continue sampling until $(\widehat{p}_\ell, n_\ell)$ hits a green
line at some stage. The coverage probability is shown in the right side of Figure 8.
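The sample sizes quoted above can be reproduced numerically. The sketch below is only an illustration: equation (5) is not restated in this section, so the code assumes that the stage sizes are obtained by spacing the stages evenly between the smallest bound $2\rho(\frac{1}{\varepsilon} - \rho)\ln\frac{1}{\zeta\delta}$ and the largest bound $\frac{1}{2\varepsilon^2}\ln\frac{1}{\zeta\delta}$ and rounding up. The function name `group_sample_sizes` is hypothetical.

```python
import math

def group_sample_sizes(eps, delta, rho, zeta, stages):
    """Approximately equally spaced stage sample sizes (assumed rule)."""
    log_term = math.log(1.0 / (zeta * delta))
    n_first = 2.0 * rho * (1.0 / eps - rho) * log_term  # bound on n_1
    n_last = log_term / (2.0 * eps ** 2)                # bound on n_s
    step = (n_last - n_first) / (stages - 1)
    # Round each interpolated value up to the next integer
    return [math.ceil(n_first + i * step) for i in range(stages)]

if __name__ == "__main__":
    print(group_sample_sizes(0.05, 0.05, 0.75, 2.6759, 7))
```

Under this assumed spacing rule the output agrees with the seven sample sizes listed in the text, which suggests, but does not prove, that it matches the rule the authors used.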

To apply this estimation method in a clinical trial for estimating the proportion $p$ of a binomial
response with margin of error 0.05 and confidence level 95%, we can have seven groups of patients
with group sizes 59, 57, 57, 58, 57, 57 and 58. In the first stage, we conduct the experiment with the
59 patients of the first group. We observe the relative frequency of response and record it as $\widehat{p}_1$.
Suppose that 12 patients have positive responses; then the relative frequency at the first
stage is $\widehat{p}_1 = \frac{12}{59} = 0.2034$. With the values of $(\widehat{p}_1, n_1) = (0.2034, 59)$, we check whether the stopping rule
is satisfied. This is equivalent to checking whether the point $(\widehat{p}_1, n_1)$ hits a green line at the first stage. For
such a value of $(\widehat{p}_1, n_1)$, it can be seen that the stopping condition is not fulfilled. So, we need to
conduct the second stage of the experiment with the 57 patients of the second group. We observe the
responses of these 57 patients. Suppose we observe that 5 patients in this group have positive
responses. Then, we add 5 to 12, the number of positive responses before the second stage, to
obtain 17 positive responses among $n_2 = 59 + 57 = 116$ patients. So, at the second stage, we
get the relative frequency $\widehat{p}_2 = \frac{17}{116} = 0.1466$. Since the stopping rule is not satisfied with the
values of $(\widehat{p}_2, n_2) = (0.1466, 116)$, we need to conduct the third stage of the experiment with the 57
patients of the third group. Suppose we observe that 14 patients in this group have positive
responses. Then, we add 14 to 17, the number of positive responses before the third stage, to
get 31 positive responses among $n_3 = 59 + 57 + 57 = 173$ patients. So, at the third stage, we
get the relative frequency $\widehat{p}_3 = \frac{31}{173} = 0.1792$. Since the stopping rule is not satisfied with the
values of $(\widehat{p}_3, n_3) = (0.1792, 173)$, we need to conduct the fourth stage of the experiment with the 58
patients of the fourth group. Suppose we observe that 15 patients in this group have positive
responses. Then, we add 15 to 31, the number of positive responses before the fourth stage, to
get 46 positive responses among $n_4 = 59 + 57 + 57 + 58 = 231$ patients. So, at the fourth stage,
we get the relative frequency $\widehat{p}_4 = \frac{46}{231} = 0.1991$. Since the stopping rule is not satisfied with the
values of $(\widehat{p}_4, n_4) = (0.1991, 231)$, we need to conduct the fifth stage of the experiment with the 57
patients of the fifth group. Suppose we observe that 6 patients in this group have positive
responses. Then, we add 6 to 46, the number of positive responses before the fifth stage, to get
52 positive responses among $n_5 = 59 + 57 + 57 + 58 + 57 = 288$ patients. So, at the fifth stage, we
get the relative frequency $\widehat{p}_5 = \frac{52}{288} = 0.1806$. It can be seen that the stopping rule is satisfied with
the values of $(\widehat{p}_5, n_5) = (0.1806, 288)$. Therefore, we can terminate the sampling experiment and
take $\widehat{p} = \frac{52}{288} = 0.1806$ as an estimate of the proportion of the whole population having positive
responses. With a 95% confidence level, one can believe that the difference between the true value
of $p$ and its estimate $\widehat{p} = 0.1806$ is less than 0.05.
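The stage-by-stage checks in this example can be replayed in a few lines of Python. This is a minimal sketch, assuming the stopping rule has the form $(|\widehat{p}_\ell - \frac{1}{2}| - \rho\varepsilon)^2 \geq \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2\ln(\zeta\delta)}$ that appears in the appendix proofs; the names `should_stop` and `run_trial` are hypothetical, and the response counts are the ones supposed in the text.

```python
import math

EPS, DELTA, RHO, ZETA = 0.05, 0.05, 0.75, 2.6759
GROUP_SIZES = [59, 57, 57, 58, 57, 57, 58]
RESPONSES = [12, 5, 14, 15, 6]  # positive responses supposed in the example

def should_stop(successes, n):
    """Double-parabolic stopping condition at cumulative sample size n."""
    lhs = (abs(successes / n - 0.5) - RHO * EPS) ** 2
    rhs = 0.25 + EPS ** 2 * n / (2.0 * math.log(ZETA * DELTA))
    return lhs >= rhs

def run_trial(group_sizes, responses):
    """Accumulate groups until the stopping condition holds."""
    n = successes = 0
    for stage, (size, positives) in enumerate(zip(group_sizes, responses), start=1):
        n += size
        successes += positives
        if should_stop(successes, n):
            return stage, n, successes / n
    return None  # data exhausted before the rule was satisfied

if __name__ == "__main__":
    print(run_trial(GROUP_SIZES, RESPONSES))
```

With the supposed data the sketch continues through the first four stages and stops at the fifth, in agreement with the narrative above.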
Figure 8: Double-parabolic sampling with $\varepsilon = \delta = 0.05$, $s = 7$, $\rho = \frac{3}{4}$ and $\zeta = 2.6759$

In this experiment, we use only 288 samples to obtain the estimate for $p$. Apart from round-off
error, there is no other source of error in reporting statistical accuracy, since no asymptotic
approximation is involved. As compared to the fixed-sample-size procedure, we achieve a substantial
saving of samples. To see this, one can check that using the rigorous formula (24) gives a sample
size of 738, which is overly conservative. From the classical approximate formula (22), the sample
size is determined as 385, which has been known to be insufficient to guarantee the prescribed
confidence level 95%. The exact method of [15] shows that at least 391 samples are needed. As
compared to the best fixed sample size obtained by the method of [15], the reduction in sample
size resulting from our double-parabolic sampling scheme is $391 - 288 = 103$. It can be seen that
the fixed-sample-size procedure wastes $\frac{103}{288} = 35.76\%$ samples as compared to our group sequential
method, which is also an exact method. This percentage might not be serious if it were a saving in the
number of simulation runs. However, as the count is of patients, the reduction of samples
is important for ethical and economic reasons. Using our group sequential method, the worst-case
sample size is equal to 403, which is only 12 more than the minimum sample size of the fixed-sample
procedure. However, a lot of samples can be saved in the average case.
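The fixed-sample-size figures in this comparison can be checked with a short calculation. Formulas (22) and (24) are not restated in this section, so the sketch below assumes they are the usual worst-case normal-approximation size $\frac{z_{\delta/2}^2}{4\varepsilon^2}$ and the Chernoff-Hoeffding size $\frac{\ln(2/\delta)}{2\varepsilon^2}$, both rounded up; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def chernoff_hoeffding_size(eps, delta):
    """Sample size from the Chernoff-Hoeffding bound (assumed form of (24))."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))

def normal_approx_size(eps, delta):
    """Worst-case normal-approximation size (assumed form of (22))."""
    z = NormalDist().inv_cdf(1.0 - delta / 2.0)
    return math.ceil(z ** 2 / (4.0 * eps ** 2))

def relative_waste(fixed_size, sequential_size):
    """Extra samples of the fixed design relative to the sequential one."""
    return (fixed_size - sequential_size) / sequential_size

if __name__ == "__main__":
    print(chernoff_hoeffding_size(0.05, 0.05))
    print(normal_approx_size(0.05, 0.05))
    print(round(100.0 * relative_waste(391, 288), 2))
```

Under these assumed formulas the values agree with the numbers 738, 385 and 35.76% quoted in the text; the exact size 391 comes from the method of [15] and is not recomputed here.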

As $\varepsilon$ or $\delta$ becomes smaller, the reduction of samples is more significant. For example, for $\varepsilon = 0.02$
and $\delta = 0.05$, we have a double-parabolic sampling scheme with 10 stages. The sampling scheme,
with a sample path, is shown in the left side of Figure 9. The coverage probability is shown in the
right side of Figure 9.
Figure 9: Double-parabolic sampling with $\varepsilon = 0.02$, $\delta = 0.05$, $s = 10$, $\rho = \frac{3}{4}$ and $\zeta = 2.6725$



8 Conclusion 



In this paper, we have reviewed recent developments of group sequential estimation methods for
a binomial proportion. We have illustrated the inclusion principle and its applications to various
stopping rules. We have introduced computational techniques in the literature, which suffice for
determining parameters of stopping rules to guarantee desired confidence levels. Moreover, we have
proposed a new family of sampling schemes with stopping boundaries of double-parabolic shape,
which are parameterized by the coverage tuning parameter and the dilation coefficient. These
parameters can be determined by the exact computational techniques to reduce the sampling cost,
while ensuring prescribed confidence levels. The new family of sampling schemes is extremely
simple in structure and asymptotically optimal as the margin of error tends to 0. We have established
analytic bounds for the distribution and expectation of the sample number at the termination of
the sampling process. We have obtained parameter values via the exact computational techniques
for the proposed sampling schemes such that the confidence levels are guaranteed and that the
sampling schemes are generally more efficient than existing ones.



A Proof of Theorem [1]



Consider the function $g(x, z) = \frac{(x - z)^2}{x(1 - x)}$ for $x \in (0, 1)$ and $z \in [0, 1]$. It can be checked that
$\frac{\partial g(x, z)}{\partial x} = \frac{(x - z)[z(1 - x) + x(1 - z)]}{[x(1 - x)]^2}$, which shows that for any fixed $z \in [0, 1]$, $-g(x, z)$ is a
unimodal function of $x \in (0, 1)$, with a maximum attained at $x = z$. By such a property of $g(x, z)$
and the definition of Wilson's confidence intervals, we have
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for ^ = 1, • • • , s, where we have used the fact that {p^ > e} C {Li > 0}, {p^ < I — e} CI {JJe < 1} 
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for ^ = 1, • • • , s. This completes the proof of the theorem. 
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B Proof of Theorem [2] 

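By the assumption that $n_s \geq \frac{1}{2\varepsilon^2} \ln \frac{1}{\zeta\delta}$, we have $\frac{1}{4} + \frac{\varepsilon^2 n_s}{2 \ln(\zeta\delta)} \leq 0$ and consequently,
$\Pr\left\{ \left( \left| \widehat{p}_s - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \geq \frac{1}{4} + \frac{\varepsilon^2 n_s}{2 \ln(\zeta\delta)} \right\} = 1$. It follows from the definition of the sampling scheme that the sampling process
must stop at or before the $s$-th stage. In other words, $\Pr\{\boldsymbol{l} \leq s\} = 1$. This allows one to write

$$\Pr\{|\widehat{p} - p| \geq \varepsilon \mid p\} = \sum_{\ell=1}^{s} \Pr\{|\widehat{p} - p| \geq \varepsilon, \ \boldsymbol{l} = \ell \mid p\} = \sum_{\ell=1}^{s} \Pr\{|\widehat{p}_\ell - p| \geq \varepsilon, \ \boldsymbol{l} = \ell \mid p\} \leq \sum_{\ell=1}^{s} \Pr\{|\widehat{p}_\ell - p| \geq \varepsilon \mid p\} \qquad (25)$$

for $p \in (0, 1)$. By virtue of the well-known Chernoff-Hoeffding bound [2, 25], we have

$$\Pr\{|\widehat{p}_\ell - p| \geq \varepsilon \mid p\} \leq 2 \exp(-2 n_\ell \varepsilon^2) \qquad (26)$$

for $\ell = 1, \cdots, s$. Making use of (25), (26) and the fact that $n_1 \geq 2\rho \left( \frac{1}{\varepsilon} - \rho \right) \ln \frac{1}{\zeta\delta}$, as can be seen
from (18), we have

$$\Pr\{|\widehat{p} - p| \geq \varepsilon \mid p\} \leq 2 \sum_{\ell=1}^{s} \exp(-2 n_\ell \varepsilon^2) \leq 2 \sum_{m=n_1}^{\infty} \exp(-2 m \varepsilon^2) = \frac{2 \exp(-2 n_1 \varepsilon^2)}{1 - \exp(-2\varepsilon^2)}$$

$$\leq \frac{2 \exp\left( -2\varepsilon^2 \times 2\rho \left( \frac{1}{\varepsilon} - \rho \right) \ln \frac{1}{\zeta\delta} \right)}{1 - \exp(-2\varepsilon^2)} = \frac{2 \exp\left( 4\varepsilon\rho(1 - \rho\varepsilon) \ln(\zeta\delta) \right)}{1 - \exp(-2\varepsilon^2)}$$

for any $p \in (0, 1)$. Therefore, to guarantee that $\Pr\{|\widehat{p} - p| < \varepsilon \mid p\} \geq 1 - \delta$ for any $p \in (0, 1)$, it is
sufficient to choose $\zeta$ such that $2 \exp\left( 4\varepsilon\rho(1 - \rho\varepsilon) \ln(\zeta\delta) \right) \leq \delta [1 - \exp(-2\varepsilon^2)]$. This inequality can be
written as $4\varepsilon\rho(1 - \rho\varepsilon) \ln(\zeta\delta) \leq \ln \frac{\delta}{2} + \ln [1 - \exp(-2\varepsilon^2)]$ or equivalently,
$\zeta \leq \frac{1}{\delta} \exp\left( \frac{\ln \frac{\delta}{2} + \ln [1 - \exp(-2\varepsilon^2)]}{4\varepsilon\rho(1 - \rho\varepsilon)} \right)$.
The proof of the theorem is thus completed.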
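To get a feel for the sufficient condition derived in this proof, the following sketch evaluates the upper bound $\frac{1}{\delta} \exp\left( \frac{\ln(\delta/2) + \ln[1 - \exp(-2\varepsilon^2)]}{4\varepsilon\rho(1 - \rho\varepsilon)} \right)$ on the coverage tuning parameter. The function name `zeta_upper_bound` is hypothetical, and the sketch assumes $\rho > 0$ and $\rho\varepsilon < 1$ so that the denominator is positive.

```python
import math

def zeta_upper_bound(eps, delta, rho):
    """Analytic value of zeta that suffices for coverage at least 1 - delta."""
    numerator = math.log(delta / 2.0) + math.log(1.0 - math.exp(-2.0 * eps ** 2))
    denominator = 4.0 * eps * rho * (1.0 - rho * eps)  # assumed positive
    return math.exp(numerator / denominator) / delta

if __name__ == "__main__":
    print(zeta_upper_bound(0.1, 0.05, 0.75))
```

For $\varepsilon = 0.1$, $\delta = 0.05$ and $\rho = \frac{3}{4}$ this analytic value is many orders of magnitude smaller than the computed value 2.4174 used in Section 6.2, which illustrates why the bound serves to show that a valid $\zeta$ exists while exact computation is used to choose an efficient one.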



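C Proof of Theorem [3]

First, we need to show that $\Pr\left\{ \lim_{\varepsilon \to 0} \frac{\mathbf{n}}{N(p, \varepsilon, \delta, \zeta)} = 1 \mid p \right\} = 1$ for any $p \in (0, 1)$. Clearly, the
sample number $\mathbf{n}$ is a random number dependent on $\varepsilon$. Note that for any $\omega \in \Omega$, the sequences
$\{\overline{X}_{\mathbf{n}(\omega)}(\omega)\}_{\varepsilon \in (0,1)}$ and $\{\overline{X}_{\mathbf{n}(\omega)-1}(\omega)\}_{\varepsilon \in (0,1)}$ are subsets of $\{\overline{X}_m(\omega)\}_{m=1}^{\infty}$. By the strong law of large
numbers, for almost every $\omega \in \Omega$, the sequence $\{\overline{X}_m(\omega)\}_{m=1}^{\infty}$ converges to $p$. Since every
subsequence of a convergent sequence must converge, it follows that the sequences $\{\overline{X}_{\mathbf{n}(\omega)}(\omega)\}_{\varepsilon \in (0,1)}$ and
$\{\overline{X}_{\mathbf{n}(\omega)-1}(\omega)\}_{\varepsilon \in (0,1)}$ converge to $p$ as $\varepsilon \to 0$ provided that $\mathbf{n}(\omega) \to \infty$ as $\varepsilon \to 0$. Since it is certain
that $\mathbf{n} \geq 2\rho \left( \frac{1}{\varepsilon} - \rho \right) \ln \frac{1}{\zeta\delta} \to \infty$ as $\varepsilon \to 0$, we have that $\{\lim_{\varepsilon \to 0} \mathbf{n} = \infty\}$ is a sure event. It follows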
that B = {lim^^o -^n-i = P, lime-s-o = P, liiTLe-s-o = 1} is an almost sure event. By the 
definition of the sampling scheme, we have that 
is a sure event. Hence, $A \cap B$ is an almost sure event. Define $C = \left\{ \lim_{\varepsilon \to 0} \frac{\mathbf{n}}{N(p, \varepsilon, \delta, \zeta)} = 1 \right\}$. We need
to show that $C$ is an almost sure event. For this purpose, we let $\omega \in A \cap B$ and expect to show
that $\omega \in C$. As a consequence of $\omega \in A \cap B$,
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By the continuity of the function |x — ^| — pe with respect to x and e, we have 
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Making use of the continuity of the function |x — ^| — pe with respect to x and e, we have 



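$$\liminf_{\varepsilon \to 0} \frac{\mathbf{n}(\omega)}{N(p, \varepsilon, \delta, \zeta)} \geq \frac{\frac{1}{4} - \left( \left| \lim_{\varepsilon \to 0} \overline{X}_{\mathbf{n}(\omega)}(\omega) - \frac{1}{2} \right| - \lim_{\varepsilon \to 0} \rho\varepsilon \right)^2}{p(1 - p)} = 1. \qquad (28)$$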



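Combining (27) and (28) yields $\lim_{\varepsilon \to 0} \frac{\mathbf{n}(\omega)}{N(p, \varepsilon, \delta, \zeta)} = 1$ and thus $A \cap B \subseteq C$. This implies that $C$ is
an almost sure event and thus $\Pr\left\{ \lim_{\varepsilon \to 0} \frac{\mathbf{n}}{N(p, \varepsilon, \delta, \zeta)} = 1 \mid p \right\} = 1$ for $p \in (0, 1)$.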



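Next, we need to show that $\lim_{\varepsilon \to 0} \Pr\{|\widehat{p} - p| < \varepsilon \mid p\} = 2\Phi\left( \sqrt{2 \ln \frac{1}{\zeta\delta}} \right) - 1$ for any $p \in (0, 1)$.
For simplicity of notations, let $\sigma = \sqrt{p(1 - p)}$ and $a = \sqrt{2 \ln \frac{1}{\zeta\delta}}$. Note that $\Pr\{|\widehat{p} - p| < \varepsilon \mid p\} =
\Pr\{|\overline{X}_{\mathbf{n}} - p| < \varepsilon \mid p\} = \Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon \sqrt{\mathbf{n}} / \sigma\}$. Clearly, for any $\eta \in (0, a)$,

$$\Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon \sqrt{\mathbf{n}} / \sigma\} \leq \Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon \sqrt{\mathbf{n}} / \sigma, \ \varepsilon \sqrt{\mathbf{n}} / \sigma \in [a - \eta, a + \eta]\} + \Pr\{\varepsilon \sqrt{\mathbf{n}} / \sigma \notin [a - \eta, a + \eta]\}$$

$$\leq \Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < a + \eta, \ \varepsilon \sqrt{\mathbf{n}} / \sigma \in [a - \eta, a + \eta]\} + \Pr\{\varepsilon \sqrt{\mathbf{n}} / \sigma \notin [a - \eta, a + \eta]\}$$

$$\leq \Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < a + \eta\} + \Pr\{\varepsilon \sqrt{\mathbf{n}} / \sigma \notin [a - \eta, a + \eta]\} \qquad (29)$$

and

$$\Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon \sqrt{\mathbf{n}} / \sigma\} \geq \Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon \sqrt{\mathbf{n}} / \sigma, \ \varepsilon \sqrt{\mathbf{n}} / \sigma \in [a - \eta, a + \eta]\}$$

$$\geq \Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < a - \eta, \ \varepsilon \sqrt{\mathbf{n}} / \sigma \in [a - \eta, a + \eta]\}$$

$$\geq \Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < a - \eta\} - \Pr\{\varepsilon \sqrt{\mathbf{n}} / \sigma \notin [a - \eta, a + \eta]\}. \qquad (30)$$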

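Recall that we have established that $\mathbf{n} / N(p, \varepsilon, \delta, \zeta) \to 1$ almost surely as $\varepsilon \to 0$. This implies that
$\varepsilon \sqrt{\mathbf{n}} / \sigma \to a$ and $\mathbf{n} / N(p, \varepsilon, \delta, \zeta) \to 1$ in probability as $\varepsilon$ tends to zero. It follows from Anscombe's
random central limit theorem [1] that as $\varepsilon$ tends to zero, $\sqrt{\mathbf{n}} (\overline{X}_{\mathbf{n}} - p) / \sigma$ converges in distribution
to a Gaussian random variable with zero mean and unit variance. Hence, from (29),

$$\limsup_{\varepsilon \to 0} \Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon \sqrt{\mathbf{n}} / \sigma\} \leq \lim_{\varepsilon \to 0} \Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < a + \eta\} + \lim_{\varepsilon \to 0} \Pr\{\varepsilon \sqrt{\mathbf{n}} / \sigma \notin [a - \eta, a + \eta]\} = 2\Phi(a + \eta) - 1$$

and from (30),

$$\liminf_{\varepsilon \to 0} \Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon \sqrt{\mathbf{n}} / \sigma\} \geq \lim_{\varepsilon \to 0} \Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < a - \eta\} - \lim_{\varepsilon \to 0} \Pr\{\varepsilon \sqrt{\mathbf{n}} / \sigma \notin [a - \eta, a + \eta]\} = 2\Phi(a - \eta) - 1.$$

Since this argument holds for arbitrarily small $\eta \in (0, a)$, it must be true that

$$\liminf_{\varepsilon \to 0} \Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon \sqrt{\mathbf{n}} / \sigma\} = \limsup_{\varepsilon \to 0} \Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon \sqrt{\mathbf{n}} / \sigma\} = 2\Phi(a) - 1.$$

So, $\lim_{\varepsilon \to 0} \Pr\{|\widehat{p} - p| < \varepsilon \mid p\} = \lim_{\varepsilon \to 0} \Pr\{\sqrt{\mathbf{n}} |\overline{X}_{\mathbf{n}} - p| / \sigma < \varepsilon \sqrt{\mathbf{n}} / \sigma\} = 2\Phi(a) - 1 = 2\Phi\left( \sqrt{2 \ln \frac{1}{\zeta\delta}} \right) - 1$ for
any $p \in (0, 1)$.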
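The limiting coverage probability obtained in this part of the proof is easy to evaluate numerically. The sketch below is an illustration only: it computes $2\Phi\left(\sqrt{2\ln\frac{1}{\zeta\delta}}\right) - 1$ with the standard normal distribution function from Python's standard library, and the function name `limiting_coverage` is hypothetical. It requires $\zeta\delta < 1$.

```python
import math
from statistics import NormalDist

def limiting_coverage(zeta, delta):
    """Limit of the coverage probability as the margin of error tends to 0."""
    a = math.sqrt(2.0 * math.log(1.0 / (zeta * delta)))  # needs zeta * delta < 1
    return 2.0 * NormalDist().cdf(a) - 1.0

def zeta_for_limiting_coverage(delta):
    """Value of zeta whose limiting coverage equals exactly 1 - delta."""
    z = NormalDist().inv_cdf(1.0 - delta / 2.0)
    return math.exp(-z * z / 2.0) / delta

if __name__ == "__main__":
    print(limiting_coverage(2.4174, 0.05))
    print(zeta_for_limiting_coverage(0.05))
```

A smaller $\zeta$ gives a larger limiting coverage, so values of $\zeta$ below the second quantity have a limiting coverage above $1 - \delta$.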

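Now, we focus our attention on showing that $\lim_{\varepsilon \to 0} \frac{E[\mathbf{n}]}{N(p, \varepsilon, \delta, \zeta)} = 1$ for any $p \in (0, 1)$. For this
purpose, it suffices to show that

$$1 - \eta \leq \liminf_{\varepsilon \to 0} \frac{E[\mathbf{n}]}{N(p, \varepsilon, \delta, \zeta)} \leq \limsup_{\varepsilon \to 0} \frac{E[\mathbf{n}]}{N(p, \varepsilon, \delta, \zeta)} \leq 1 + \eta, \qquad \forall p \in (0, 1) \qquad (31)$$

for any $\eta \in (0, 1)$. For simplicity of notations, we abbreviate $N(p, \varepsilon, \delta, \zeta)$ as $N$ in the sequel. Since
we have established $\Pr\left\{ \lim_{\varepsilon \to 0} \frac{\mathbf{n}}{N} = 1 \right\} = 1$, we can conclude that

$$\lim_{\varepsilon \to 0} \Pr\{(1 - \eta) N \leq \mathbf{n} \leq (1 + \eta) N\} = 1. \qquad (32)$$

Noting that

$$E[\mathbf{n}] = \sum_{m=0}^{\infty} m \Pr\{\mathbf{n} = m\} \geq \sum_{(1-\eta)N \leq m \leq (1+\eta)N} m \Pr\{\mathbf{n} = m\} \geq (1 - \eta) N \sum_{(1-\eta)N \leq m \leq (1+\eta)N} \Pr\{\mathbf{n} = m\},$$

we have

$$E[\mathbf{n}] \geq (1 - \eta) N \Pr\{(1 - \eta) N \leq \mathbf{n} \leq (1 + \eta) N\}. \qquad (33)$$

Combining (32) and (33) yields

$$\liminf_{\varepsilon \to 0} \frac{E[\mathbf{n}]}{N(p, \varepsilon, \delta, \zeta)} \geq (1 - \eta) \lim_{\varepsilon \to 0} \Pr\{(1 - \eta) N \leq \mathbf{n} \leq (1 + \eta) N\} = 1 - \eta.$$

On the other hand, using $E[\mathbf{n}] = \sum_{m=0}^{\infty} \Pr\{\mathbf{n} > m\}$, we can write

$$E[\mathbf{n}] = \sum_{0 \leq m < (1+\eta)N} \Pr\{\mathbf{n} > m\} + \sum_{m \geq (1+\eta)N} \Pr\{\mathbf{n} > m\} \leq \lceil (1 + \eta) N \rceil + \sum_{m \geq (1+\eta)N} \Pr\{\mathbf{n} > m\}.$$

Since $\limsup_{\varepsilon \to 0} \frac{\lceil (1 + \eta) N \rceil}{N(p, \varepsilon, \delta, \zeta)} = 1 + \eta$, for the purpose of establishing $\limsup_{\varepsilon \to 0} \frac{E[\mathbf{n}]}{N(p, \varepsilon, \delta, \zeta)} \leq 1 + \eta$, it
remains to show that

$$\limsup_{\varepsilon \to 0} \frac{\sum_{m \geq (1+\eta)N} \Pr\{\mathbf{n} > m\}}{N(p, \varepsilon, \delta, \zeta)} = 0.$$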
£^0 N{p,£,6,C) 
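Consider functions $f(x) = \frac{1}{4} - \left( \left| x - \frac{1}{2} \right| - \rho\varepsilon \right)^2$ and $g(x) = x(1 - x)$ for $x \in [0, 1]$. Note that

$$|f(x) - g(x)| = \rho\varepsilon \, \big| |2x - 1| - \rho\varepsilon \big| \leq \rho\varepsilon (1 + \rho\varepsilon)$$

for all $x \in [0, 1]$. For $p \in (0, 1)$, there exists a positive number $\gamma < \min\{p, 1 - p\}$ such that
$|g(x) - g(p)| < \frac{\eta}{2} p(1 - p)$ for any $x \in (p - \gamma, p + \gamma)$, since $g(x)$ is a continuous function of $x$. From
now on, let $\varepsilon > 0$ be sufficiently small such that $\rho\varepsilon (1 + \rho\varepsilon) < \frac{\eta}{2} p(1 - p)$. Then,

$$f(x) \leq g(x) + \rho\varepsilon (1 + \rho\varepsilon) < g(p) + \frac{\eta}{2} p(1 - p) + \rho\varepsilon (1 + \rho\varepsilon) < (1 + \eta) p(1 - p)$$

for all $x \in (p - \gamma, p + \gamma)$. This implies that

$$\{\overline{X}_m \in (p - \gamma, p + \gamma)\} \subseteq \left\{ (1 + \eta) p(1 - p) > \frac{1}{4} - \left( \left| \overline{X}_m - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \right\} \qquad (34)$$

for all $m > 0$. Taking complementary events on both sides of (34) leads to

$$\left\{ (1 + \eta) p(1 - p) \leq \frac{1}{4} - \left( \left| \overline{X}_m - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \right\} \subseteq \{\overline{X}_m \notin (p - \gamma, p + \gamma)\}$$

for all $m > 0$. Since $(1 + \eta) p(1 - p) = \frac{(1 + \eta) N \varepsilon^2}{2 \ln \frac{1}{\zeta\delta}} \leq \frac{m \varepsilon^2}{2 \ln \frac{1}{\zeta\delta}}$ for all $m \geq (1 + \eta) N$, it follows that

$$\left\{ \frac{m \varepsilon^2}{2 \ln \frac{1}{\zeta\delta}} \leq \frac{1}{4} - \left( \left| \overline{X}_m - \frac{1}{2} \right| - \rho\varepsilon \right)^2 \right\} \subseteq \{\overline{X}_m \notin (p - \gamma, p + \gamma)\}$$

for all $m \geq (1 + \eta) N$. Therefore, we have shown that if $\varepsilon$ is sufficiently small, then there exists a
number $\gamma > 0$ such that

$$\{\mathbf{n} > m\} \subseteq \left\{ \left( \left| \overline{X}_m - \frac{1}{2} \right| - \rho\varepsilon \right)^2 < \frac{1}{4} + \frac{m \varepsilon^2}{2 \ln(\zeta\delta)} \right\} \subseteq \{\overline{X}_m \notin (p - \gamma, p + \gamma)\}$$

for all $m \geq (1 + \eta) N$. Using this inclusion relationship and the Chernoff-Hoeffding bound [2, 25],
we have

$$\Pr\{\mathbf{n} > m\} \leq \Pr\{\overline{X}_m \notin (p - \gamma, p + \gamma)\} \leq 2 \exp(-2 m \gamma^2) \qquad (35)$$

for all $m \geq (1 + \eta) N$ provided that $\varepsilon > 0$ is sufficiently small. Letting $k = \lceil (1 + \eta) N \rceil$ and using
(35), we have

$$\sum_{m \geq (1+\eta)N} \Pr\{\mathbf{n} > m\} = \sum_{m \geq k} \Pr\{\mathbf{n} > m\} \leq \sum_{m \geq k} 2 \exp(-2 m \gamma^2) = \frac{2 \exp(-2 k \gamma^2)}{1 - \exp(-2\gamma^2)}$$

provided that $\varepsilon$ is sufficiently small. Consequently,

$$\limsup_{\varepsilon \to 0} \frac{\sum_{m \geq (1+\eta)N} \Pr\{\mathbf{n} > m\}}{N(p, \varepsilon, \delta, \zeta)} \leq \limsup_{\varepsilon \to 0} \frac{2 \exp(-2 k \gamma^2)}{N [1 - \exp(-2\gamma^2)]} = 0,$$

since $k \to \infty$ and $N \to \infty$ as $\varepsilon \to 0$. So, we have established (31). Since the argument holds for
arbitrarily small $\eta > 0$, it must be true that $\lim_{\varepsilon \to 0} \frac{E[\mathbf{n}]}{N(p, \varepsilon, \delta, \zeta)} = 1$ for any $p \in (0, 1)$. This completes
the proof of the theorem.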
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D Proof of Theorem [4] 



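Recall that $\boldsymbol{l}$ denotes the index of stage at the termination of the sampling process. Observing that

$$n_s - n_1 \Pr\{\boldsymbol{l} = 1\} = n_s \Pr\{\boldsymbol{l} \leq s\} - n_1 \Pr\{\boldsymbol{l} \leq 1\}$$

$$= \sum_{\ell=2}^{s} \left( n_\ell \Pr\{\boldsymbol{l} \leq \ell\} - n_{\ell-1} \Pr\{\boldsymbol{l} \leq \ell - 1\} \right)$$

$$= \sum_{\ell=2}^{s} n_\ell \left( \Pr\{\boldsymbol{l} \leq \ell\} - \Pr\{\boldsymbol{l} \leq \ell - 1\} \right) + \sum_{\ell=2}^{s} (n_\ell - n_{\ell-1}) \Pr\{\boldsymbol{l} \leq \ell - 1\}$$

$$= \sum_{\ell=2}^{s} n_\ell \Pr\{\boldsymbol{l} = \ell\} + \sum_{\ell=1}^{s-1} (n_{\ell+1} - n_\ell) \Pr\{\boldsymbol{l} \leq \ell\},$$

we have $n_s - \sum_{\ell=1}^{s} n_\ell \Pr\{\boldsymbol{l} = \ell\} = \sum_{\ell=1}^{s-1} (n_{\ell+1} - n_\ell) \Pr\{\boldsymbol{l} \leq \ell\}$. Making use of this result and the
fact $n_s = n_1 + \sum_{\ell=1}^{s-1} (n_{\ell+1} - n_\ell)$, we have

$$E[\mathbf{n}] = \sum_{\ell=1}^{s} n_\ell \Pr\{\boldsymbol{l} = \ell\} = n_s - \left( n_s - \sum_{\ell=1}^{s} n_\ell \Pr\{\boldsymbol{l} = \ell\} \right)$$

$$= n_1 + \sum_{\ell=1}^{s-1} (n_{\ell+1} - n_\ell) - \sum_{\ell=1}^{s-1} (n_{\ell+1} - n_\ell) \Pr\{\boldsymbol{l} \leq \ell\}$$

$$= n_1 + \sum_{\ell=1}^{\tau-1} (n_{\ell+1} - n_\ell) \Pr\{\boldsymbol{l} > \ell\} + \sum_{\ell=\tau}^{s-1} (n_{\ell+1} - n_\ell) \Pr\{\boldsymbol{l} > \ell\}. \qquad (36)$$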
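The identity behind (36), $E[\mathbf{n}] = n_1 + \sum_{\ell=1}^{s-1} (n_{\ell+1} - n_\ell) \Pr\{\boldsymbol{l} > \ell\}$, can be sanity-checked numerically. In the sketch below the stage sample sizes are those of the example in Section 7, while the stopping-stage distribution is purely hypothetical and chosen only for illustration; the function names are likewise hypothetical.

```python
import math

def expected_size_direct(sizes, stop_probs):
    """E[n] as the sum of n_l * Pr{l = l} over all stages."""
    return sum(n * q for n, q in zip(sizes, stop_probs))

def expected_size_tail(sizes, stop_probs):
    """E[n] as n_1 plus increments weighted by the survival probabilities."""
    total = float(sizes[0])
    survive = 1.0
    for i in range(len(sizes) - 1):
        survive -= stop_probs[i]  # Pr{l > i + 1}
        total += (sizes[i + 1] - sizes[i]) * survive
    return total

if __name__ == "__main__":
    sizes = [59, 116, 173, 231, 288, 345, 403]
    probs = [0.10, 0.20, 0.30, 0.20, 0.10, 0.05, 0.05]  # hypothetical
    print(expected_size_direct(sizes, probs), expected_size_tail(sizes, probs))
```

Both functions return the same value for any distribution over the stages that sums to one, which is the content of the telescoping argument above.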



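By the definition of the stopping rule, we have

$$\{\boldsymbol{l} > \ell\} \subseteq \left\{ \left( \left| \widehat{p}_\ell - \frac{1}{2} \right| - \rho\varepsilon \right)^2 < \frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)} \right\}$$

$$= \left\{ \rho\varepsilon - \sqrt{\frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)}} < \left| \widehat{p}_\ell - \frac{1}{2} \right| < \rho\varepsilon + \sqrt{\frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)}} \right\}$$

$$\subseteq \{a_\ell < \widehat{p}_\ell < b_\ell\} \cup \{1 - b_\ell < \widehat{p}_\ell < 1 - a_\ell\} \qquad (37)$$

for $1 \leq \ell < s$, where $b_\ell = \frac{1}{2} - \rho\varepsilon + \sqrt{\frac{1}{4} + \frac{\varepsilon^2 n_\ell}{2 \ln(\zeta\delta)}}$ for $\ell = 1, \cdots, s - 1$. By the assumption that $\varepsilon$ and
$\rho$ are non-negative, we have $1 - b_\ell - a_\ell = 2\rho\varepsilon \geq 0$ for $\ell = 1, \cdots, s - 1$. It follows from (37) that
$\{\boldsymbol{l} > \ell\} \subseteq \{\widehat{p}_\ell > a_\ell\}$ for $\ell = 1, \cdots, s - 1$. By the definition of $\tau$, we have $p < a_\ell$ for $\tau \leq \ell < s$.
Making use of this fact, the inclusion relationship $\{\boldsymbol{l} > \ell\} \subseteq \{\widehat{p}_\ell > a_\ell\}$, $\ell = 1, \cdots, s - 1$, and the
Chernoff-Hoeffding bound [2, 25], we have

$$\Pr\{\mathbf{n} > n_\ell \mid p\} = \Pr\{\boldsymbol{l} > \ell \mid p\} \leq \Pr\{\widehat{p}_\ell > a_\ell \mid p\} \leq \exp(n_\ell \mathscr{M}(a_\ell, p)) \qquad (38)$$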
(38) 
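for $\tau \leq \ell < s$. It follows from (36) and (38) that

$$E[\mathbf{n}] \leq n_1 + \sum_{\ell=1}^{\tau-1} (n_{\ell+1} - n_\ell) + \sum_{\ell=\tau}^{s-1} (n_{\ell+1} - n_\ell) \Pr\{\boldsymbol{l} > \ell\}$$

$$= n_\tau + \sum_{\ell=\tau}^{s-1} (n_{\ell+1} - n_\ell) \Pr\{\boldsymbol{l} > \ell\} \leq n_\tau + \sum_{\ell=\tau}^{s-1} (n_{\ell+1} - n_\ell) \exp(n_\ell \mathscr{M}(a_\ell, p)).$$

This completes the proof of the theorem.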
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